Article | ACM SIGIR Forum | Aug 2, 2017 | Closed access

Pivoted Document Length Normalization

Cornell University

Indexed in Crossref

Abstract

Automatic information retrieval systems have to deal with documents of varying lengths in a text collection. Document length normalization is used to fairly retrieve documents of all lengths. In this study, we observe that a normalization scheme that retrieves documents of all lengths with similar chances as their likelihood of relevance will outperform another scheme which retrieves documents with chances very different from their likelihood of relevance. We show that the retrieval probabilities for a particular normalization method deviate systematically from the relevance probabilities across different collections. We present pivoted normalization, a technique that can be used to modify any normalization…

Citation impact

  • Total citations: 863
  • FWCI: 14.98
  • Percentile: 100%
  • References: 18

Authors

3

Topics & keywords

Keywords
  • Normalization (sociology)
  • Computer science
  • Relevance (law)
  • Byte
  • Trigonometric functions
  • Algorithm
  • Data mining
  • Information retrieval
UN Sustainable Development Goals
  • Quality Education