Article · Oct 1, 2019 · Closed access

A Comprehensive Overhaul of Feature Distillation

Naver (South Korea) · Seoul National University

Indexed in Crossref

Abstract

We investigate the design aspects of feature distillation methods for network compression and propose a novel feature distillation method whose distillation loss is designed to create synergy among four aspects: teacher transform, student transform, distillation feature position, and distance function. The proposed distillation loss includes a feature transform with a newly designed margin ReLU, a new distillation feature position, and a partial L2 distance function that skips redundant information that adversely affects the compression of the student. On ImageNet, the proposed method achieves a top-1 error of 21.65% with ResNet50, which outperforms the teacher network, ResNet152.…
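The abstract names a margin ReLU and a partial L2 distance without defining them. The following is a minimal NumPy sketch under stated assumptions, not the authors' implementation: it assumes the margin ReLU is max(x, m) for a non-positive margin m, and that the partial L2 distance skips positions where the teacher target is non-positive and the student response is already at or below it. The function names, the scalar margin of -0.5, and the toy feature values are all hypothetical.

```python
import numpy as np


def margin_relu(x, margin):
    """Assumed form of margin ReLU: keep values above `margin`, clamp the rest to it.

    `margin` is expected to be <= 0, so positive responses pass through unchanged
    and strongly negative responses are clamped to the margin.
    """
    return np.maximum(x, margin)


def partial_l2(teacher, student):
    """Assumed form of partial L2: squared error summed over non-skipped positions.

    A position is skipped when the teacher target is non-positive and the
    student is already at or below that target.
    """
    skip = (student <= teacher) & (teacher <= 0)
    squared_error = (teacher - student) ** 2
    return float(np.sum(np.where(skip, 0.0, squared_error)))


# Hypothetical toy features (illustrative values, not from the paper).
teacher_features = np.array([1.5, -2.0, -0.3, 0.8])
student_features = np.array([1.0, -1.0, 0.2, 0.8])

target = margin_relu(teacher_features, margin=-0.5)  # teacher transform
loss = partial_l2(target, student_features)          # distance function
print(target, loss)
```

In this toy case the second position is skipped because the student response (-1.0) is already below the clamped teacher target (-0.5), so only the first and third positions contribute to the loss.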

Citation impact

Total citations: 636
FWCI: 21.84
Percentile: 100%
References: 56

Authors

6

Topics & keywords

Keywords
  • Distillation
  • Computer science
  • Feature (linguistics)
  • Margin (machine learning)
  • Feature extraction
  • Artificial intelligence
  • Position (finance)
  • Code (set theory)
UN Sustainable Development Goals
  • Quality Education