Benchmarking Large Language Models in Retrieval-Augmented Generation
Institute of Software · University of Chinese Academy of Sciences · +2 more institutions
Abstract
Retrieval-Augmented Generation (RAG) is a promising approach for mitigating the hallucination of large language models (LLMs). However, existing research lacks rigorous evaluation of the impact of retrieval-augmented generation on different large language models, which makes it challenging to identify the potential bottlenecks in the capabilities of RAG for different LLMs. In this paper, we systematically investigate the impact of Retrieval-Augmented Generation on large language models. We analyze the performance of different large language models on four fundamental abilities required for RAG: noise robustness, negative rejection, information integration, and counterfactual robustness. To this end, we…
Citation impact
- FWCI: 41.90
- Percentile: 100%
- References: 50
Authors (4)
- Jiawei Chen (corresponding author)
Institute of Software, University of Chinese Academy of Sciences
- Hongyu Lin
Chinese Academy of Sciences, Institute of Software
- Xianpei Han
Chinese Academy of Sciences, Institute of Software, State Key Laboratory of Computer Science
- Le Sun
Chinese Academy of Sciences, Institute of Software, State Key Laboratory of Computer Science
Topics & keywords
- Benchmarking
- Computer science
- Natural language processing
- Information retrieval
- Artificial intelligence
- Business
- Quality Education