Article · NEJM AI · Apr 19, 2024 · Bronze OA

Large Language Models Are Poor Medical Coders — Benchmarking of Medical Code Querying

Icahn School of Medicine at Mount Sinai · Tel Aviv University · +2 more institutions

Indexed in Crossref

Abstract

BACKGROUND Large language models (LLMs) have attracted significant interest for automated clinical coding. However, early data show that LLMs are highly error-prone when mapping medical codes. We sought to quantify and benchmark LLM medical code querying errors across several available LLMs.

Citation impact

Total citations: 117
FWCI: 24.69
Percentile: 100%
References: 14

Authors: 8

Topics & keywords

Keywords
  • Benchmarking
  • Computer science
  • Code (set theory)
  • Diagnosis code
  • Natural language processing
  • Programming language
  • Medicine
  • Business
UN Sustainable Development Goals
  • No poverty

Funding