Article | Jan 1, 2024 | Gold OA
Benchmarking Retrieval-Augmented Generation for Medicine
Indexed in Crossref
Abstract
While large language models (LLMs) have achieved state-of-the-art performance on a wide range of medical question answering (QA) tasks, they still face challenges with hallucinations and outdated knowledge. Retrieval-augmented generation (RAG) is a promising solution and has been widely adopted. However, a RAG system can involve multiple flexible components, and there is a lack of best practices regarding the optimal RAG setting for various medical purposes. To systematically evaluate such systems, we propose the Medical Information Retrieval-Augmented Generation Evaluation (MIRAGE), a first-of-its-kind benchmark including 7,663 questions from five medical QA datasets. Using MIRAGE, we conducted large-scale…
Citation impact
- Total citations: 197
- FWCI: 61.74
- Percentile: 100%
- References: 0
Authors
4
Topics & keywords
Keywords
- Benchmarking
- Computer science
- Information retrieval
- Artificial intelligence
- Natural language processing
- Business