Is ChatGPT Good at Search? Investigating Large Language Models as Re-Ranking Agents

Sun, Weiwei; Yan, Lingyong; Ma, Xinyu; Wang, Shuaiqiang; Ren, Pengjie; Chen, Zhumin; Yin, Dawei; Ren, Zhaochun

doi:10.18653/v1/2023.emnlp-main.923

Article · Jan 1, 2023 · Gold OA

Shandong University · Baidu (China) · +1 more institution

Indexed in Crossref

Abstract

Large Language Models (LLMs) have demonstrated remarkable zero-shot generalization across various language-related tasks, including search engines. However, existing work utilizes the generative ability of LLMs for Information Retrieval (IR) rather than direct passage ranking. The discrepancy between the pre-training objectives of LLMs and the ranking objective poses another challenge. In this paper, we first investigate generative LLMs such as ChatGPT and GPT-4 for relevance ranking in IR. Surprisingly, our experiments reveal that properly instructed LLMs can deliver competitive, even superior results to state-of-the-art supervised methods on popular IR benchmarks. Furthermore, to address concerns about data…

Citation impact

Total citations: 184

FWCI: 30.48
Percentile: 100%
References: 40

Authors: 8

Topics & keywords

Keywords

Ranking (information retrieval)
Computer science
Benchmark (surveying)
Language model
Relevance (law)
Machine learning
Artificial intelligence
Set (abstract data type)

UN Sustainable Development Goals

Quality Education
