Article · National Science Review · May 1, 2023 · Gold OA

Data quantity governance for machine learning in materials science

Shanghai University of Engineering Science · Shanghai Institute of Computing Technology · +3 more institutions

Indexed in: Crossref, DOAJ, PubMed

Abstract

Data-driven machine learning (ML) is widely employed in the analysis of materials structure-activity relationships, performance optimization and materials design because of its ability to reveal latent data patterns and make accurate predictions. However, because materials data are laborious to acquire, ML models face a mismatch between a high-dimensional feature space and a small sample size (for traditional ML models) or between the number of model parameters and the sample size (for deep-learning models), which usually results in poor performance. Here, we review efforts to tackle this issue via feature reduction, sample augmentation and specific ML approaches, and…

Citation impact

Total citations: 181
FWCI: 16.43
Percentile: 100%
References: 105

Authors

7

Topics & keywords

Keywords
  • Computer science
  • Process (computing)
  • Sample (material)
  • Corporate governance
  • Feature (linguistics)
  • Domain (mathematical analysis)
  • Dimension (graph theory)
  • Domain knowledge

Funding