A Gift from Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning

Yim, Junho; Joo, Donggyu; Bae, Ji‐Hoon; Kim, Junmo

doi:10.1109/cvpr.2017.754

Article · Jul 1, 2017 · Closed access

Korea Advanced Institute of Science and Technology · Electronics and Telecommunications Research Institute

Indexed in Crossref

Abstract

We introduce a novel technique for knowledge transfer, where knowledge from a pretrained deep neural network (DNN) is distilled and transferred to another DNN. As the DNN performs a mapping from the input space to the output space through many layers sequentially, we define the distilled knowledge to be transferred in terms of flow between layers, which is calculated by computing the inner product between features from two layers. When we compare the student DNN and the original network with the same size as the student DNN but trained without a teacher network, the proposed method of transferring the distilled knowledge as the flow between two layers exhibits three important phenomena: (1) the student DNN…
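The abstract describes the transferred knowledge as a "flow between layers", computed as an inner product between features from two layers. The sketch below is one possible reading of that sentence, not the authors' implementation. It assumes two feature maps with the same spatial size, stored as nested lists shaped `[height][width][channels]`. The names `flow_matrix` and `flow_loss` are hypothetical, and both the averaging over spatial positions and the mean-squared-difference loss are illustrative assumptions that the abstract does not state.

```python
def flow_matrix(feat_a, feat_b):
    """Channel-by-channel inner products between two feature maps.

    feat_a: nested list shaped [h][w][m] (features from an earlier layer)
    feat_b: nested list shaped [h][w][n] (features from a later layer)
    Returns an m x n matrix as a list of lists.
    """
    h, w = len(feat_a), len(feat_a[0])
    if len(feat_b) != h or len(feat_b[0]) != w:
        raise ValueError("both feature maps need the same spatial size")
    m, n = len(feat_a[0][0]), len(feat_b[0][0])

    flow = [[0.0] * n for _ in range(m)]
    for s in range(h):
        for t in range(w):
            for i in range(m):
                for j in range(n):
                    # Inner product over spatial positions for channel pair (i, j).
                    flow[i][j] += feat_a[s][t][i] * feat_b[s][t][j]

    # Assumption: average over the h * w spatial positions.
    return [[value / (h * w) for value in row] for row in flow]


def flow_loss(teacher_flow, student_flow):
    """Mean squared difference between two flow matrices (illustrative choice)."""
    diffs = [
        (t - s) ** 2
        for t_row, s_row in zip(teacher_flow, student_flow)
        for t, s in zip(t_row, s_row)
    ]
    return sum(diffs) / len(diffs)


# Tiny example: a 1 x 2 spatial grid, 2 input channels, 1 output channel.
layer_in = [[[1.0, 2.0], [3.0, 4.0]]]
layer_out = [[[1.0], [1.0]]]
teacher = flow_matrix(layer_in, layer_out)
```

Under this reading, a student network would be trained so that its flow matrices approach those of the teacher, for example by minimising `flow_loss(teacher, student)` over matching layer pairs.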

Citation impact

Total citations: 1,642

FWCI: 90.09
Percentile: 100%
References: 43

Authors: 4

Topics & keywords

Keywords

Computer science
Artificial neural network
Task (project management)
Artificial intelligence
Transfer of learning
Knowledge transfer
Distillation
Scratch

UN Sustainable Development Goals

Quality Education
