Correlation Congruence for Knowledge Distillation
National University of Defense Technology · Group Sense (China) · +2 more institutions
Abstract
Most teacher-student frameworks based on knowledge distillation (KD) rely on a strong congruence constraint at the instance level. However, they usually ignore the correlation between multiple instances, which is also valuable for knowledge transfer. In this work, we propose a new framework named correlation congruence for knowledge distillation (CCKD), which transfers not only the instance-level information but also the correlation between instances. Furthermore, a generalized kernel method based on Taylor series expansion is proposed to better capture the correlation between instances. Empirical experiments and ablation studies on image classification tasks (including CIFAR-100, ImageNet-1K) and metric learning…
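The idea of matching correlations between instances, rather than only per-instance outputs, can be illustrated with a minimal NumPy sketch. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: the function names, the choice of a Gaussian RBF kernel on L2-normalized embeddings, its truncated Taylor expansion, the default `gamma` and `order`, and the mean squared difference between the teacher and student correlation matrices.

```python
import math

import numpy as np


def l2_normalize(features, eps=1e-12):
    """Scale each row (one instance embedding) to unit L2 norm."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    return features / np.maximum(norms, eps)


def taylor_rbf_correlation(features, gamma=0.4, order=2):
    """Pairwise correlation matrix from a truncated Taylor series of an RBF kernel.

    Assumption (not stated in the abstract): for unit-norm x and y,
    exp(-gamma * ||x - y||^2) = exp(-2 * gamma) * exp(2 * gamma * x.y),
    and the second exponential is expanded up to the given order.
    """
    f = l2_normalize(features)
    gram = f @ f.T  # (n, n) matrix of dot products between instances
    corr = np.zeros_like(gram)
    for p in range(order + 1):
        corr += (2.0 * gamma) ** p / math.factorial(p) * gram ** p
    return math.exp(-2.0 * gamma) * corr


def correlation_congruence_loss(teacher_features, student_features,
                                gamma=0.4, order=2):
    """Mean squared difference between teacher and student correlation matrices.

    Both matrices are (n, n), so the teacher and student embedding
    dimensions may differ.
    """
    c_t = taylor_rbf_correlation(teacher_features, gamma, order)
    c_s = taylor_rbf_correlation(student_features, gamma, order)
    n = c_t.shape[0]
    return float(np.sum((c_t - c_s) ** 2) / n ** 2)


# Usage: a batch of 8 instances, 16-d teacher and 4-d student embeddings.
rng = np.random.default_rng(0)
teacher = rng.normal(size=(8, 16))
student = rng.normal(size=(8, 4))
loss = correlation_congruence_loss(teacher, student)
```

In a training loop, a term like this would presumably be added to the usual instance-level distillation loss with a weighting factor; how the two are balanced is not specified in the abstract.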
Citation impact
- FWCI: 33.22
- Percentile: 100%
- References: 67
Authors: 8
Topics & keywords
- Distillation
- Correlation
- Congruence (geometry)
- Computer science
- Metric (unit)
- Artificial intelligence
- Machine learning
- Kernel (algebra)
- Quality Education