Large Language Models Are Poor Medical Coders — Benchmarking of Medical Code Querying
Icahn School of Medicine at Mount Sinai · Tel Aviv University · +2 more institutions
Indexed in Crossref
Abstract
BACKGROUND: Large language models (LLMs) have attracted significant interest for automated clinical coding. However, early data show that LLMs are highly error-prone when mapping medical codes. We sought to quantify and benchmark LLM medical code querying errors across several available LLMs.
Citation impact
- Total citations: 117
- FWCI: 24.69
- Percentile: 100%
- References: 14
Authors: 8
Topics & keywords
Topics
Keywords
- Benchmarking
- Computer science
- Code (set theory)
- Diagnosis code
- Natural language processing
- Programming language
- Medicine
- Business
UN Sustainable Development Goals
- No poverty