A critical assessment of using ChatGPT for extracting structured data from clinical notes
The University of Texas Southwestern Medical Center
Abstract
Existing natural language processing (NLP) methods for converting free-text clinical notes into structured data often require problem-specific annotations and model training. This study evaluates ChatGPT's capacity to extract information from free-text medical notes efficiently and comprehensively. We developed a large language model (LLM)-based workflow that uses a systems engineering methodology and a spiral "prompt engineering" process, leveraging OpenAI's API for batch querying of ChatGPT. We evaluated the effectiveness of this method using a dataset of more than 1000 lung cancer pathology reports and a dataset of 191 pediatric osteosarcoma pathology reports, comparing the ChatGPT-3.5 (gpt-3.5-turbo-16k)…
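The batch-extraction idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' workflow: the field names, the prompt wording, and the `ask_model` callable are all hypothetical. In practice `ask_model` would wrap a call to OpenAI's API; here it is injected so the sketch runs without network access.

```python
import json
from typing import Callable, Dict, Iterable, List, Optional, Sequence

# Hypothetical field names, chosen only for illustration.
FIELDS = ("histologic_type", "tumor_size_cm")


def build_prompt(report: str, fields: Sequence[str] = FIELDS) -> str:
    """Build a prompt asking for one JSON object with exactly the given keys."""
    keys = ", ".join(f'"{name}"' for name in fields)
    return (
        "Extract the following fields from the pathology report below. "
        f"Reply with a single JSON object with keys {keys}; "
        "use null when the report does not state a value.\n\n"
        f"Report:\n{report}"
    )


def parse_reply(reply: str, fields: Sequence[str] = FIELDS) -> Dict[str, Optional[object]]:
    """Parse the model reply; fall back to all-null values on malformed output."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}
    # Keep only the requested keys so every row has the same shape.
    return {name: data.get(name) for name in fields}


def extract_batch(
    reports: Iterable[str],
    ask_model: Callable[[str], str],  # assumption: takes a prompt, returns reply text
) -> List[Dict[str, Optional[object]]]:
    """Query the model once per report and collect structured rows."""
    return [parse_reply(ask_model(build_prompt(report))) for report in reports]


if __name__ == "__main__":
    # Stub standing in for a real LLM call.
    def fake_model(prompt: str) -> str:
        return '{"histologic_type": "adenocarcinoma", "tumor_size_cm": 2.5}'

    for row in extract_batch(["report one", "report two"], fake_model):
        print(row)
```

Returning a fixed set of keys, with null for anything missing or unparseable, keeps each row comparable against manually curated annotations when the output is evaluated.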
Citation impact
- FWCI: 24.84
- Percentile: 100%
- References: 22
Authors (14)
- Jingwei Huang (corresponding author)
The University of Texas Southwestern Medical Center
- Donghan M. Yang
The University of Texas Southwestern Medical Center
- Ruichen Rong
The University of Texas Southwestern Medical Center
- Kuroush Nezafati
The University of Texas Southwestern Medical Center
- Colin Treager
The University of Texas Southwestern Medical Center
Topics & keywords
- Computer science
- Data science
- Peace, Justice and strong institutions
Funding
- Cancer Prevention and Research Institute of Texas (Award: RP230330)
- Division of Intramural Research, National Institute of Allergy and Infectious Diseases (Award: U01AI169298)
- National Institutes of Health (Awards: RP230330, U01AI169298, R01GM140012, R01DE030656, R01GM141519, P50CA70907, U01CA249245, RP180805, R35GM136375)
- National Cancer Institute (Awards: U01CA249245, P50CA70907)
- National Institute of General Medical Sciences (Awards: R01GM140012, R35GM136375, R01GM141519)
- National Institute of Dental and Craniofacial Research (Award: R01DE030656)