Article · Nov 13, 2004 · Closed access

Simple BM25 extension to multiple weighted fields

Microsoft (United States) · Microsoft Research (United Kingdom)

Indexed in Crossref

Abstract

This paper describes a simple way of adapting the BM25 ranking formula to deal with structured documents. In the past it has been common to compute scores for the individual fields (e.g. title and body) independently and then combine these scores (typically linearly) to arrive at a final score for the document. We highlight how this approach can lead to poor performance by breaking the carefully constructed non-linear saturation of term frequency in the BM25 function. We propose a much more intuitive alternative which weights term frequencies before the nonlinear term frequency saturation function is applied. In this scheme, a structured document with a title weight of two is mapped to an unstructured document…
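The contrast drawn in the abstract can be illustrated with a minimal sketch. It is not the paper's exact formulation: it assumes a simplified saturation function tf / (k1 + tf) with no document-length normalisation, a single query term, and a fixed IDF of 1.0. The function names, the field names, and the value k1 = 1.2 are hypothetical choices made for illustration only.

```python
def saturate(tf, k1=1.2):
    """Simplified BM25-style saturation (assumption: no length normalisation)."""
    return tf / (k1 + tf)


def score_linear_combination(field_tfs, weights, idf=1.0, k1=1.2):
    """Score each field independently, then combine the scores linearly."""
    return sum(weights[f] * idf * saturate(tf, k1) for f, tf in field_tfs.items())


def score_weighted_tf(field_tfs, weights, idf=1.0, k1=1.2):
    """Weight the term frequencies first, then saturate the combined count once."""
    combined_tf = sum(weights[f] * tf for f, tf in field_tfs.items())
    return idf * saturate(combined_tf, k1)


# Hypothetical field weights: title counts double, body counts once.
weights = {"title": 2, "body": 1}

# A query term that occurs once in the title and once in the body.
doc = {"title": 1, "body": 1}

linear = score_linear_combination(doc, weights)
weighted = score_weighted_tf(doc, weights)
print(f"linear combination of field scores: {linear:.3f}")
print(f"saturation of weighted frequencies: {weighted:.3f}")
```

In this sketch a single term can never contribute more than the IDF (here 1.0) under the weighted-frequency scheme, because saturation is applied once to the combined count. The linear combination exceeds that ceiling for the same document, since every field saturates separately and the field weights then scale the results. Also within this sketch, one title occurrence with weight two scores exactly the same as two body occurrences with weight one.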

Citation impact

  • Total citations: 701
  • FWCI: 45.24
  • Percentile: 100%
  • References: 20

Authors

3

Topics & keywords

Keywords
  • Extension (predicate logic)
  • Simple (philosophy)
  • Computer science
  • Algorithm
  • Artificial intelligence
  • Programming language
UN Sustainable Development Goals
  • No poverty