Toward expert-level medical question answering with large language models

Singhal, K. K.; Tu, Tao; Gottweis, Juraj; Sayres, Rory; Wulczyn, Ellery; Amin, Mohamed; Hou, Le; Clark, Kevin; Pfohl, Stephen; Cole-Lewis, Heather; Neal, Darlene; Rashid, Qazi Mamunur; Schaekermann, Mike; Wang, Amy; Dash, Dev; Chen, Jonathan H.; Shah, Nigam H.; Lachgar, Sami; Mansfield, P.; Prakash, Sushant; Green, Bradley; Dominowska, Ewa; Arcas, Blaise Agüera y; Tomašev, Nenad; Liu, Yun; Wong, Renee; Semturs, Christopher; Mahdavi, S. Sara; Barral, Joëlle; Webster, Dale R.; Corrado, Greg S.; Matias, Yossi; Azizi, Shekoofeh; Karthikesalingam, Alan; Natarajan, Vivek

doi:10.1038/s41591-024-03423-7

Article · Nature Medicine · Jan 8, 2025 · Hybrid OA

K. K. Singhal · Tao Tu · Juraj Gottweis · Rory Sayres · Ellery Wulczyn

Google (United States) · Stanford University · +2 more institutions

Indexed in: Crossref, PubMed

Abstract

Large language models (LLMs) have shown promise in medical question answering, with Med-PaLM being the first to exceed a 'passing' score in United States Medical Licensing Examination style questions. However, challenges remain in long-form medical question answering and handling real-world workflows. Here, we present Med-PaLM 2, which bridges these gaps with a combination of base LLM improvements, medical domain fine-tuning and new strategies for improving reasoning and grounding through ensemble refinement and chain of retrieval. Med-PaLM 2 scores up to 86.5% on the MedQA dataset, improving upon Med-PaLM by over 19%, and demonstrates dramatic performance increases across MedMCQA, PubMedQA and MMLU clinical…

Citation impact

Total citations: 673

FWCI: 1226.70
Percentile: 100%
References: 57

Authors: 35

Topics & keywords

Keywords

Question answering
Computer science
Natural language processing
Medicine
Psychology
Information retrieval

UN Sustainable Development Goals

Quality Education
