Variational Information Distillation for Knowledge Transfer
Korea Advanced Institute of Science and Technology · Kootenay Association for Science & Technology
Abstract
Transferring knowledge from a teacher neural network pretrained on the same or a similar task to a student neural network can significantly improve the performance of the student neural network. Existing knowledge transfer approaches match the activations or the corresponding hand-crafted features of the teacher and the student networks. We propose an information-theoretic framework for knowledge transfer which formulates knowledge transfer as maximizing the mutual information between the teacher and the student networks. We compare our method with existing knowledge transfer methods on both knowledge distillation and transfer learning tasks and show that our method consistently outperforms existing methods.…
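The abstract does not spell out how the mutual information is maximized. One common route, suggested by the word "variational" in the title, is the standard lower bound I(t; s) ≥ H(t) + E[log q(t | s)], where t is a teacher feature, s is a student feature, and q is any tractable distribution. Because H(t) does not depend on the student, minimizing the negative log-likelihood of teacher features under q tightens the bound. The sketch below assumes a diagonal Gaussian q whose mean is predicted from the student; the function name `gaussian_nll` and its arguments are hypothetical and are not taken from the paper.

```python
import math


def gaussian_nll(teacher, student_mean, log_var):
    """Average negative log-density of teacher features under a diagonal
    Gaussian q(t | s) = N(t; student_mean, exp(log_var)).

    Assumption: q is Gaussian with a per-feature variance. This is one
    possible variational family, chosen here only for illustration.
    """
    total = 0.0
    for t, mu, lv in zip(teacher, student_mean, log_var):
        var = math.exp(lv)
        # -log N(t; mu, var) = 0.5 * (log var + log 2*pi + (t - mu)^2 / var)
        total += 0.5 * (lv + math.log(2.0 * math.pi) + (t - mu) ** 2 / var)
    return total / len(teacher)


# Hypothetical features: the closer the student's prediction is to the
# teacher's features, the lower the loss and the tighter the bound.
teacher_features = [1.0, 2.0]
good_student = gaussian_nll(teacher_features, [1.0, 2.0], [0.0, 0.0])
poor_student = gaussian_nll(teacher_features, [0.0, 0.0], [0.0, 0.0])
```

In a training loop, a term like this would be added to the student's task loss, with `student_mean` and `log_var` produced by learnable layers.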
Citation impact
- FWCI: 37.58
- Percentile: 100%
- References: 60
Authors (5)
- Sungsoo Ahn (corresponding author)
Korea Advanced Institute of Science and Technology, Kootenay Association for Science & Technology
- Shell Xu Hu
École nationale des ponts et chaussées, Euclid Network
- Andreas Damianou
Amazon (United Kingdom), Amazon (Germany)
- Neil D. Lawrence
Amazon (Germany), Amazon (United Kingdom)
- Zhenwen Dai
Amazon (Germany), Amazon (United Kingdom)
Topics & keywords
- Computer science
- Convolutional neural network
- Transfer of learning
- Artificial intelligence
- Knowledge transfer
- Distillation
- Artificial neural network
- Machine learning
- Quality Education