Pivoted Document Length Normalization
Indexed in Crossref
Abstract
Automatic information retrieval systems have to deal with documents of varying lengths in a text collection. Document length normalization is used to fairly retrieve documents of all lengths. In this study, we observe that a normalization scheme that retrieves documents of all lengths with similar chances as their likelihood of relevance will outperform another scheme which retrieves documents with chances very different from their likelihood of relevance. We show that the retrieval probabilities for a particular normalization method deviate systematically from the relevance probabilities across different collections. We present pivoted normalization, a technique that can be used to modify any normalization…
Citation impact
863
total citations
- FWCI: 14.98
- Percentile: 100%
- References: 18
Authors
3
Topics & keywords
Topics
Keywords
- Normalization (sociology)
- Computer science
- Relevance (law)
- Byte
- Trigonometric functions
- Algorithm
- Data mining
- Information retrieval
UN Sustainable Development Goals
- Quality Education