Data quantity governance for machine learning in materials science
Shanghai University of Engineering Science · Shanghai Institute of Computing Technology · +3 more institutions
Abstract
Data-driven machine learning (ML) is widely employed in the analysis of materials structure-activity relationships, performance optimization and materials design because of its superior ability to reveal latent data patterns and make accurate predictions. However, because materials data acquisition is laborious, ML models face a mismatch between a high-dimensional feature space and a small sample size (for traditional ML models) or between the number of model parameters and the sample size (for deep-learning models), which usually results in poor performance. Here, we review the efforts for tackling this issue via feature reduction, sample augmentation and specific ML approaches, and…
Citation impact
- FWCI: 16.43
- Percentile: 100%
- References: 105
Authors (7)
- Yue Liu (corresponding author)
Shanghai University of Engineering Science, Shanghai Institute of Computing Technology
- Zhengwei Yang
Shanghai University of Engineering Science
- Xinxin Zou
Shanghai University of Engineering Science
- Shuchang Ma
Shanghai University of Engineering Science
- Dahui Liu
Shanghai University of Engineering Science
Topics & keywords
- Computer science
- Process (computing)
- Sample (material)
- Corporate governance
- Feature (linguistics)
- Domain (mathematical analysis)
- Dimension (graph theory)
- Domain knowledge