LLM-assisted systematic review of large language models in clinical medicine
Duke University · Washington University in St. Louis · +10 more institutions
Abstract
Clinical evaluations of large language models (LLMs) have rapidly expanded since 2022, yet their evidence base remains opaque. The overwhelming volume of studies creates challenges for manual curation and review. However, LLMs themselves offer the scalability and capability to evaluate the ever-growing evidence base. This LLM-assisted review identified 4,609 peer-reviewed studies in clinical medicine between January 2022 and September 2025, equating to roughly 3.2 papers per day. Only 1,048 studies used real-world patient data and of these only 19 were prospective randomized trials; most addressed simulated scenarios (n = 1,857) or exam-style tasks (n = 1,704). ChatGPT and related OpenAI models constitute…
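The study counts quoted in the abstract can be turned into proportions with a few lines of arithmetic. The sketch below is illustrative only and is not part of the review's own pipeline. The variable names are hypothetical, and it assumes the three study categories are mutually exclusive, which is consistent with their counts summing to the reported total of 4,609.

```python
# Counts as reported in the abstract; variable names are illustrative.
total_studies = 4609
by_category = {
    "real-world patient data": 1048,
    "simulated scenarios": 1857,
    "exam-style tasks": 1704,
}
prospective_rcts = 19  # subset of the real-world patient data studies

# Assumption: the categories do not overlap. Their sum matches the total.
assert sum(by_category.values()) == total_studies

# Share of each category among all identified studies.
shares = {name: count / total_studies for name, count in by_category.items()}

# Share of prospective randomized trials among real-world data studies.
rct_share_of_real_world = prospective_rcts / by_category["real-world patient data"]

for name, share in shares.items():
    print(f"{name}: {share:.1%}")
print(f"prospective RCTs among real-world studies: {rct_share_of_real_world:.1%}")
```

Under these assumptions, about 22.7% of the studies used real-world patient data, 40.3% addressed simulated scenarios, 37.0% addressed exam-style tasks, and about 1.8% of the real-world studies were prospective randomized trials.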
Citation impact
- FWCI: 95.93
- Percentile: 100%
- References: 17
Authors (12)
- Sully F. Chen (corresponding author)
  Duke University
- Anton Alyakin
  Washington University in St. Louis, NYU Langone Health, Médecins Sans Frontières
- Andreas Seas
  Duke University
- Eunice Yang
  NYU Langone Health, Columbia University
- Jinhyuk Choi
  University of Georgia, NYU Langone Health, Augusta University Health
Topics & keywords
- MEDLINE
- Task (project management)
- Clinical trial
- Alternative medicine
- Equating
- Evidence-based medicine
- Randomized controlled trial
- Scale (ratio)
- Quality Education