Article · Nature Medicine · Jan 8, 2025 · Hybrid OA

Toward expert-level medical question answering with large language models

Google (United States) · Stanford University · +2 more institutions

Indexed in: Crossref, PubMed

Abstract

Large language models (LLMs) have shown promise in medical question answering, with Med-PaLM being the first to exceed a 'passing' score in United States Medical Licensing Examination style questions. However, challenges remain in long-form medical question answering and handling real-world workflows. Here, we present Med-PaLM 2, which bridges these gaps with a combination of base LLM improvements, medical domain fine-tuning and new strategies for improving reasoning and grounding through ensemble refinement and chain of retrieval. Med-PaLM 2 scores up to 86.5% on the MedQA dataset, improving upon Med-PaLM by over 19%, and demonstrates dramatic performance increases across MedMCQA, PubMedQA and MMLU clinical…

Citation impact

  • Total citations: 673
  • FWCI: 1226.70
  • Percentile: 100%
  • References: 57

Authors

35 authors

Topics & keywords

Keywords
  • Question answering
  • Computer science
  • Natural language processing
  • Medicine
  • Psychology
  • Information retrieval
UN Sustainable Development Goals
  • Quality Education

Funding