Relational Knowledge Distillation
Korea Post · Pohang University of Science and Technology · Kao Corporation (Japan) · Microsoft Research Asia (China)
Abstract
Knowledge distillation aims at transferring knowledge acquired in one model (a teacher) to another model (a student) that is typically smaller. Previous approaches can be expressed as a form of training the student to mimic output activations of individual data examples represented by the teacher. We introduce a novel approach, dubbed relational knowledge distillation (RKD), that transfers mutual relations of data examples instead. For concrete realizations of RKD, we propose distance-wise and angle-wise distillation losses that penalize structural differences in relations. Experiments conducted on different tasks show that the proposed method improves educated student models by a significant margin. In…
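The distance-wise and angle-wise losses named in the abstract can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' implementation: the Huber penalty, the normalization of distances by their batch mean, and every function name (`huber`, `pairwise_distances`, `triplet_cosines`, `rkd_distance_loss`, `rkd_angle_loss`) are assumptions that the abstract itself does not state.

```python
import math
from itertools import combinations, permutations


def huber(x, y, delta=1.0):
    """Huber penalty between two scalars (assumed choice of penalty)."""
    d = abs(x - y)
    return 0.5 * d * d if d <= delta else delta * (d - 0.5 * delta)


def sub(a, b):
    """Element-wise difference of two equal-length vectors."""
    return [x - y for x, y in zip(a, b)]


def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in v))


def pairwise_distances(emb):
    """Distance for every unordered pair, divided by the batch mean (assumed)."""
    d = [norm(sub(emb[i], emb[j])) for i, j in combinations(range(len(emb)), 2)]
    mean = sum(d) / len(d)
    return [x / mean for x in d] if mean > 0 else d


def triplet_cosines(emb, eps=1e-12):
    """Cosine of the angle at vertex j for every ordered triplet (i, j, k)."""
    out = []
    for i, j, k in permutations(range(len(emb)), 3):
        u, v = sub(emb[i], emb[j]), sub(emb[k], emb[j])
        dot = sum(a * b for a, b in zip(u, v))
        out.append(dot / max(norm(u) * norm(v), eps))
    return out


def rkd_distance_loss(teacher, student):
    """Mean penalty on differences between teacher and student pair distances."""
    t, s = pairwise_distances(teacher), pairwise_distances(student)
    return sum(huber(a, b) for a, b in zip(t, s)) / len(t)


def rkd_angle_loss(teacher, student):
    """Mean penalty on differences between teacher and student triplet angles."""
    t, s = triplet_cosines(teacher), triplet_cosines(student)
    return sum(huber(a, b) for a, b in zip(t, s)) / len(t)
```

Because both losses compare relations among examples rather than individual activations, the teacher and student embeddings in this sketch may have different dimensions. The batch needs at least two examples for the distance term and three for the angle term.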
Citation impact
- FWCI: 77.19
- Percentile: 100%
- References: 79
Authors
- Wonpyo Park (corresponding author)
Korea Post, Pohang University of Science and Technology, Kao Corporation (Japan)
- Dongju Kim
Pohang University of Science and Technology, Korea Post, Kao Corporation (Japan)
- Yan Lu
Pohang University of Science and Technology, Microsoft Research Asia (China), Kao Corporation (Japan)
- Minsu Cho
Pohang University of Science and Technology, Korea Post, Kao Corporation (Japan)
Topics & keywords
- Margin (machine learning)
- Distillation
- Benchmark (surveying)
- Metric (unit)
- Computer science
- Statistical relational learning
- Artificial intelligence
- Machine learning
- Quality Education