Learning from massive noisy labeled data for image classification
Chinese University of Hong Kong · Baidu (China)
Abstract
Large-scale supervised datasets are crucial to train convolutional neural networks (CNNs) for various computer vision problems. However, obtaining a massive amount of well-labeled data is usually very expensive and time consuming. In this paper, we introduce a general framework to train CNNs with only a limited number of clean labels and millions of easily obtained noisy labels. We model the relationships between images, class labels and label noises with a probabilistic graphical model and further integrate it into an end-to-end deep learning system. To demonstrate the effectiveness of our approach, we collect a large-scale real-world clothing classification dataset with both noisy and clean labels.…
Citation impact
- FWCI
- 48.97
- Percentile
- 100%
- References
- 45
Authors
5Topics & keywords
- Computer science
- Artificial intelligence
- Convolutional neural network
- Probabilistic logic
- Graphical model
- Pattern recognition (psychology)
- Machine learning
- Scale (ratio)