Large language models for data extraction from unstructured and semi-structured electronic health records: a multiple model performance evaluation
Triemli Hospital · University Hospital of Zurich
Abstract
We aimed to evaluate the performance of multiple large language models (LLMs) in data extraction from unstructured and semi-structured electronic health records.
Fifty synthetic medical notes in English, each containing a structured and an unstructured part, were drafted and evaluated by domain experts and subsequently used for LLM prompting. Eighteen LLMs were evaluated against a baseline transformer-based model. Performance assessment comprised four entity extraction and five binary classification tasks, for a total of 450 predictions per LLM. Consistency of LLM responses was assessed over three same-prompt iterations.
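The evaluation design described above can be illustrated with a short sketch. The abstract does not name the metrics used, so this sketch assumes exact-match accuracy for scoring and an "all runs agree" rate for consistency. The function names `total_predictions`, `accuracy`, and `consistency` are hypothetical and are not taken from the paper. The prediction count follows from the stated design, assuming every task is applied to every note: nine tasks across 50 notes gives 450 predictions.

```python
# Illustrative sketch only: the metric definitions below are assumptions,
# not the authors' published method.

N_NOTES = 50        # synthetic medical notes (from the abstract)
ENTITY_TASKS = 4    # entity extraction tasks (from the abstract)
BINARY_TASKS = 5    # binary classification tasks (from the abstract)


def total_predictions(n_notes=N_NOTES, n_tasks=ENTITY_TASKS + BINARY_TASKS):
    """Number of predictions per model if every task is run on every note."""
    return n_notes * n_tasks


def accuracy(predictions, gold):
    """Exact-match accuracy: share of predictions equal to the gold label."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("gold must not be empty")
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)


def consistency(runs):
    """Share of items for which all same-prompt runs gave an identical answer.

    `runs` is a list of answer lists, one per iteration, aligned by item.
    """
    if not runs or not runs[0]:
        raise ValueError("runs must contain at least one non-empty run")
    n_items = len(runs[0])
    if any(len(run) != n_items for run in runs):
        raise ValueError("all runs must cover the same items")
    identical = sum(len(set(answers)) == 1 for answers in zip(*runs))
    return identical / n_items
```

For example, `consistency` called with three runs over three items, where only the third item differs in one run, returns two thirds. Other definitions of consistency, such as pairwise agreement between runs, would be equally plausible readings of the abstract.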
Citation impact
- FWCI: 94.57
- Percentile: 100%
- References: 19
Authors (8)
- Vasileios Ntinopoulos (corresponding author)
Triemli Hospital, University Hospital of Zurich
- Héctor Rodríguez Cetina Biefer
Triemli Hospital, University Hospital of Zurich
- I. Tudorache
Triemli Hospital, University Hospital of Zurich
- Nestoras Papadopoulos
Triemli Hospital, University Hospital of Zurich
- Dragan Odavic
Triemli Hospital, University Hospital of Zurich
Topics & keywords
- Unstructured data
- Health records
- Computer science
- Language model
- Data extraction
- Natural language processing
- Data mining
- Data science