Pivoted Document Length Normalization

Singhal, Amit; Buckley, Chris; Mitra, Mandar

doi:10.1145/3130348.3130365

Article | ACM SIGIR Forum | Aug 2, 2017 | Closed access

Cornell University

Indexed in Crossref

Abstract

Automatic information retrieval systems have to deal with documents of varying lengths in a text collection. Document length normalization is used to fairly retrieve documents of all lengths. In this study, we observe that a normalization scheme that retrieves documents of all lengths with similar chances as their likelihood of relevance will outperform another scheme which retrieves documents with chances very different from their likelihood of relevance. We show that the retrieval probabilities for a particular normalization method deviate systematically from the relevance probabilities across different collections. We present pivoted normalization, a technique that can be used to modify any normalization…
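
The abstract is cut off before it describes the technique, so the following is only a minimal sketch of the commonly cited form of pivoted normalization, in which an existing normalization factor is tilted around a pivot value: pivoted norm = (1 - slope) * pivot + slope * old norm. The function names, the three example documents, and the pivot and slope values are hypothetical and chosen purely for illustration; they are not taken from this page.

```python
def pivoted_norm(old_norm: float, pivot: float, slope: float) -> float:
    """Tilt an existing normalization factor around a pivot value.

    Assumed form: (1 - slope) * pivot + slope * old_norm.
    With slope < 1, factors below the pivot grow and factors above it shrink.
    """
    return (1.0 - slope) * pivot + slope * old_norm


def normalized_weight(raw_weight: float, old_norm: float,
                      pivot: float, slope: float) -> float:
    """Divide a raw term weight by the pivoted normalization factor."""
    return raw_weight / pivoted_norm(old_norm, pivot, slope)


# Hypothetical old normalization factors (for example, cosine norms).
old_norms = {"short": 5.0, "average": 10.0, "long": 20.0}
pivot = 10.0   # assumed here to be the average old normalization factor
slope = 0.75   # illustrative value, would be tuned per collection

pivoted = {name: pivoted_norm(n, pivot, slope) for name, n in old_norms.items()}

for name in old_norms:
    print(f"{name}: old={old_norms[name]:.2f} pivoted={pivoted[name]:.2f}")
```

In this sketch a document whose old factor equals the pivot is left unchanged, a shorter document is normalized more heavily than before, and a longer one less heavily, which is the direction of correction the abstract motivates when retrieval probabilities deviate from relevance probabilities.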

Citation impact

Total citations: 863

FWCI: 14.98
Percentile: 100%
References: 18

Topics & keywords

Keywords

Normalization (sociology)
Computer science
Relevance (law)
Byte
Trigonometric functions
Algorithm
Data mining
Information retrieval
