Article · Jul 30, 2011 · Closed access

KenLM: Faster and Smaller Language Model Queries

Carnegie Mellon University

Abstract

We present KenLM, a library that implements two data structures for efficient language model queries, reducing both time and memory costs. The PROBING data structure uses linear probing hash tables and is designed for speed. Compared with the widely used SRILM, our PROBING model is 2.4 times as fast while using 57% of the memory. The TRIE data structure is a trie with bit-level packing, sorted records, interpolation search, and optional quantization aimed at lower memory consumption. TRIE simultaneously uses less memory than the smallest lossless baseline and less CPU than the fastest baseline. Our code is open-source, thread-safe, and integrated into the Moses, cdec, and Joshua translation systems. This…
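The two lookup techniques named in the abstract, linear probing and interpolation search, can be illustrated with a minimal sketch. This is not KenLM's code or API: the names `ProbingTable` and `interpolation_search` are hypothetical, plain integers stand in for hashed n-grams, floats stand in for stored probabilities, and bit-level packing and quantization are left out entirely.

```python
from typing import List, Optional, Tuple


class ProbingTable:
    """Open-addressing hash table with linear probing (illustrative only)."""

    def __init__(self, capacity: int) -> None:
        if capacity < 2:
            raise ValueError("capacity must be at least 2")
        self.slots: List[Optional[Tuple[int, float]]] = [None] * capacity
        self.size = 0

    def insert(self, key: int, value: float) -> None:
        i = key % len(self.slots)
        # Walk forward until we hit the key itself or an empty slot.
        while self.slots[i] is not None and self.slots[i][0] != key:
            i = (i + 1) % len(self.slots)
        if self.slots[i] is None:
            # Always keep one slot empty so that every probe terminates.
            if self.size + 1 >= len(self.slots):
                raise OverflowError("table is full")
            self.size += 1
        self.slots[i] = (key, value)

    def find(self, key: int) -> Optional[float]:
        i = key % len(self.slots)
        # An empty slot means the key was never inserted.
        while self.slots[i] is not None:
            if self.slots[i][0] == key:
                return self.slots[i][1]
            i = (i + 1) % len(self.slots)
        return None


def interpolation_search(keys: List[int], target: int) -> Optional[int]:
    """Return the index of target in the sorted list keys, or None."""
    lo, hi = 0, len(keys) - 1
    while lo <= hi and keys[lo] <= target <= keys[hi]:
        if keys[hi] == keys[lo]:
            # All remaining keys are equal, so target must equal keys[lo].
            return lo
        # Estimate the position by assuming keys are roughly evenly spread.
        pivot = lo + (target - keys[lo]) * (hi - lo) // (keys[hi] - keys[lo])
        if keys[pivot] == target:
            return pivot
        if keys[pivot] < target:
            lo = pivot + 1
        else:
            hi = pivot - 1
    return None


table = ProbingTable(8)
table.insert(1, -0.5)
table.insert(9, -1.25)  # 9 % 8 == 1, so this entry probes to the next slot
sorted_keys = [2, 5, 9, 14, 20, 33]
```

Here `table.find(9)` walks one slot past the collision to reach its entry, and `interpolation_search(sorted_keys, 14)` narrows in on the target by estimating its position from the key values rather than always halving the range as binary search does.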

Citation impact

Total citations: 1,146
FWCI: 75.62
Percentile: 100%
References: 16

Authors

1

Topics & keywords

Keywords
  • Trie
  • Computer science
  • Hash table
  • Data structure
  • Parallel computing
  • Implementation
  • Programming language
  • Theoretical computer science