Article, Jan 1, 2024, Gold OA

Benchmarking Retrieval-Augmented Generation for Medicine

Indexed in Crossref

Abstract

While large language models (LLMs) have achieved state-of-the-art performance on a wide range of medical question answering (QA) tasks, they still face challenges with hallucinations and outdated knowledge. Retrieval-augmented generation (RAG) is a promising solution and has been widely adopted. However, a RAG system can involve multiple flexible components, and there is a lack of best practices regarding the optimal RAG setting for various medical purposes. To systematically evaluate such systems, we propose the Medical Information Retrieval-Augmented Generation Evaluation (MIRAGE), a first-of-its-kind benchmark including 7,663 questions from five medical QA datasets. Using MIRAGE, we conducted large-scale…

Citation impact

Total citations: 197
FWCI: 61.74
Percentile: 100%
References: 0

Authors

4

Topics & keywords

Keywords
  • Benchmarking
  • Computer science
  • Information retrieval
  • Artificial intelligence
  • Natural language processing
  • Business

Funding