Large language models to identify social determinants of health in electronic health records
Brigham and Women's Hospital · Harvard University · and 6 more institutions
Abstract
Social determinants of health (SDoH) play a critical role in patient outcomes, yet their documentation is often missing or incomplete in the structured data of electronic health records (EHRs). Large language models (LLMs) could enable high-throughput extraction of SDoH from the EHR to support research and clinical care. However, class imbalance and data limitations present challenges for this sparsely documented yet critical information. Here, we investigated the optimal methods for using LLMs to extract six SDoH categories from narrative text in the EHR: employment, housing, transportation, parental status, relationship, and social support. The best-performing models were fine-tuned Flan-T5 XL for any SDoH…
Citation impact
- FWCI: 85.39
- Percentile: 100%
- References: 58
Authors (15)
- Marco Guevara-Vega (corresponding author)
Brigham and Women's Hospital, Harvard University, Dana-Farber Cancer Institute, Dana-Farber Brigham Cancer Center, Mass General Brigham
- Shan Chen
Brigham and Women's Hospital, Harvard University, Dana-Farber Cancer Institute, Dana-Farber Brigham Cancer Center, Mass General Brigham
- Spencer A. Thomas
Brigham and Women's Hospital, Boston Children's Hospital, Harvard University, Dana-Farber Cancer Institute, Dana-Farber Brigham Cancer Center, Mass General Brigham
- Tafadzwa L. Chaunzwa
Brigham and Women's Hospital, Harvard University, Dana-Farber Cancer Institute, Dana-Farber Brigham Cancer Center, Mass General Brigham
- Idalid Franco
Brigham and Women's Hospital, Dana-Farber Cancer Institute, Dana-Farber Brigham Cancer Center
Topics & keywords
- Social determinants of health
- Health care
- Medicine
- Political science
- Law
Funding
- Conquer Cancer Foundation
- Radiological Society of North America
- NRG Oncology
- ViewRay
- European Commission (Award: 866504)
- National Institutes of Health (Awards: U01CA209414, U24CA194354, NIH-3R01CA240582-01A1S1, R35CA22052, U01CA190234, R01LM013486, U54CA274516)
- National Cancer Institute (Awards: U01CA209414, U54CA274516-01A1, U24CA194354, R35CA22052)
- U.S. National Library of Medicine (Award: R01LM013486)