Toward expert-level medical question answering with large language models
Google (United States) · Stanford University · +2 more institutions
Abstract
Large language models (LLMs) have shown promise in medical question answering, with Med-PaLM being the first to exceed a 'passing' score on United States Medical Licensing Examination-style questions. However, challenges remain in long-form medical question answering and handling real-world workflows. Here, we present Med-PaLM 2, which bridges these gaps with a combination of base LLM improvements, medical domain fine-tuning and new strategies for improving reasoning and grounding through ensemble refinement and chain of retrieval. Med-PaLM 2 scores up to 86.5% on the MedQA dataset, improving upon Med-PaLM by over 19%, and demonstrates dramatic performance increases across MedMCQA, PubMedQA and MMLU clinical…
Citation impact
- FWCI: 1226.70
- Percentile: 100%
- References: 57
Authors
35
Topics & keywords
- Question answering
- Computer science
- Natural language processing
- Medicine
- Psychology
- Information retrieval
- Quality Education