A Comprehensive Overhaul of Feature Distillation

Heo, Byeongho; Kim, Jeesoo; Yun, Sangdoo; Park, Hyojin; Kwak, Nojun; Choi, Jin Young

doi:10.1109/iccv.2019.00201

Article · Oct 1, 2019 · Closed access

Naver (South Korea) · Seoul National University

Indexed in Crossref

Abstract

We investigate the design aspects of feature distillation methods for network compression and propose a novel feature distillation method whose distillation loss is designed to create synergy among four aspects: teacher transform, student transform, distillation feature position, and distance function. The proposed distillation loss includes a feature transform with a newly designed margin ReLU, a new distillation feature position, and a partial L2 distance function that skips redundant information that adversely affects the compression of the student. On ImageNet, the proposed method achieves a top-1 error of 21.65% with ResNet50, outperforming the teacher network, ResNet152.…
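
The abstract names a margin ReLU and a partial L2 distance without defining them. The sketch below is one plausible reading, not the authors' implementation: it assumes the margin ReLU is an elementwise max(x, m) with a negative margin m applied to the teacher feature, and that the partial L2 distance skips any position where the teacher target is non-positive and the student response is already at or below it. The function names, the single scalar margin, and the flat-list feature layout are all illustrative assumptions.

```python
from typing import List, Sequence


def margin_relu(x: Sequence[float], margin: float) -> List[float]:
    """Elementwise max(x, margin). ASSUMPTION: margin is a non-positive scalar."""
    if margin > 0:
        raise ValueError("margin is assumed to be non-positive")
    return [max(v, margin) for v in x]


def partial_l2(teacher: Sequence[float], student: Sequence[float]) -> float:
    """Sum of squared differences, skipping 'redundant' positions.

    ASSUMPTION: a position is skipped when the teacher target is <= 0 and
    the student value is already <= the teacher target.
    """
    if len(teacher) != len(student):
        raise ValueError("teacher and student must have the same length")
    total = 0.0
    for t, s in zip(teacher, student):
        if s <= t <= 0.0:
            continue  # student is already at or below a non-positive target
        total += (t - s) ** 2
    return total


# Hypothetical feature values, chosen only to exercise each branch.
teacher_raw = [1.5, -2.0, -0.25, 0.0]
student = [1.0, -3.0, 0.25, -1.0]

teacher_target = margin_relu(teacher_raw, margin=-0.5)
loss = partial_l2(teacher_target, student)
print(teacher_target, loss)
```

In this toy input, the second and fourth positions are skipped, so only the first and third contribute to the loss. A real implementation would operate on multi-dimensional feature maps and would likely use a per-channel margin rather than one scalar.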

Citation impact

Total citations: 636

FWCI: 21.84
Percentile: 100%
References: 56

Authors: 6

Topics & keywords

Keywords

Distillation
Computer science
Feature (linguistics)
Margin (machine learning)
Feature extraction
Artificial intelligence
Position (finance)
Code (set theory)

UN Sustainable Development Goals

Quality Education
