Harnessing Large Language Models to Collect and Analyze Metal–Organic Framework Property Data Set
Korea Advanced Institute of Science and Technology
Abstract
This research focused on the efficient collection of experimental metal-organic framework (MOF) data from scientific literature to address the challenges of accessing hard-to-find data and improving the quality of information available for machine learning studies in materials science. Utilizing a chain of advanced large language models (LLMs), we developed a systematic approach to extract and organize MOF data into a structured format. Our methodology successfully compiled information from more than 40,000 research articles, creating a comprehensive and ready-to-use data set. Specifically, data regarding MOF synthesis conditions and properties were extracted from both tables and text and then analyzed.…
Citation impact
- FWCI: 29.03
- Percentile: 100%
- References: 54
Authors (6)
- Yeonghun Kang
  Korea Advanced Institute of Science and Technology
- Wonseok Lee
  Korea Advanced Institute of Science and Technology
- Taeun Bae
  Korea Advanced Institute of Science and Technology
- Seunghee Han
  Korea Advanced Institute of Science and Technology
- Huiwon Jang
  Korea Advanced Institute of Science and Technology
Topics & keywords
- Chemistry
- Property (philosophy)
- Data set
- Set (abstract data type)
- Programming language
- Artificial intelligence
- Computer science