Article · Jan 1, 2023 · Gold OA

Precise Zero-Shot Dense Retrieval without Relevance Labels

Carnegie Mellon University · University of Waterloo

Indexed in Crossref

Abstract

While dense retrieval has been shown to be effective and efficient across tasks and languages, it remains difficult to create effective fully zero-shot dense retrieval systems when no relevance labels are available. In this paper, we recognize the difficulty of zero-shot learning and encoding relevance. Instead, we propose to pivot through Hypothetical Document Embeddings (HyDE). Given a query, HyDE first zero-shot prompts an instruction-following language model (e.g., InstructGPT) to generate a hypothetical document. The document captures relevance patterns but is "fake" and may contain hallucinations. Then, an unsupervised contrastively learned encoder (e.g., Contriever) encodes the document into an…
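The pipeline described in the abstract (generate a hypothetical document, encode it, use the result for retrieval) can be illustrated with a minimal Python sketch. Everything named here is a hypothetical stand-in: `generate_hypothetical_document` is a canned stub in place of an instruction-following language model such as InstructGPT, `encode` is a toy hashed bag-of-words encoder in place of a contrastively learned encoder such as Contriever, and the final cosine-similarity lookup over `CORPUS` is an assumption, since the abstract text above is cut off before it describes what happens after encoding.

```python
import hashlib
import math
import re
from collections import Counter

DIM = 1024  # size of the toy embedding space (arbitrary choice for this sketch)


def generate_hypothetical_document(query: str) -> str:
    """Hypothetical stub for the language-model step.

    A real system would prompt an instruction-following model here.
    Unknown queries fall back to the query text itself.
    """
    canned = {
        "how do vaccines work": (
            "Vaccines train the immune system by exposing it to a harmless "
            "antigen so the body produces antibodies and memory cells."
        ),
    }
    return canned.get(query.lower().strip("?! ."), query)


def encode(text: str) -> list[float]:
    """Toy encoder: hashed bag-of-words, L2-normalised.

    This is NOT Contriever; it only mimics the 'text in, vector out' interface.
    """
    vec = [0.0] * DIM
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    for token, count in Counter(tokens).items():
        idx = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % DIM
        vec[idx] += count
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def hyde_search(query: str, corpus: list[str], k: int = 1) -> list[str]:
    """Search with the embedding of a generated document, not of the query."""
    hypothetical = generate_hypothetical_document(query)
    q_vec = encode(hypothetical)
    # Vectors are unit length, so the dot product equals cosine similarity.
    scored = [
        (sum(a * b for a, b in zip(q_vec, encode(doc))), doc) for doc in corpus
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


CORPUS = [
    "A vaccine exposes the immune system to a harmless antigen, "
    "prompting antibodies and memory cells.",
    "The stock market closed higher on Friday after strong earnings reports.",
    "Sourdough bread needs a long fermentation to develop flavor.",
]

if __name__ == "__main__":
    print(hyde_search("How do vaccines work?", CORPUS, k=1))
```

The point of the sketch is the control flow: the query itself is never embedded. Only the generated text is, so any hallucinated details in it matter only insofar as they shift the vector relative to real documents in the corpus.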

Citation impact

Total citations: 220
FWCI: 36.37
Percentile: 100%
References: 53

Authors

4

Topics & keywords

Keywords
  • Relevance (law)
  • Computer science
  • Similarity (geometry)
  • Relevance feedback
  • Embedding
  • Vector space model
  • Artificial intelligence
  • Information retrieval
UN Sustainable Development Goals
  • Quality Education

Funding