KenLM: Faster and Smaller Language Model Queries

Heafield, Kenneth

Article, Jul 30, 2011, Closed access

Abstract

We present KenLM, a library that implements two data structures for efficient language model queries, reducing both time and memory costs. The PROBING data structure uses linear probing hash tables and is designed for speed. Compared with the widely used SRILM, our PROBING model is 2.4 times as fast while using 57% of the memory. The TRIE data structure is a trie with bit-level packing, sorted records, interpolation search, and optional quantization aimed at lower memory consumption. TRIE simultaneously uses less memory than the smallest lossless baseline and less CPU than the fastest baseline. Our code is open-source, thread-safe, and integrated into the Moses, cdec, and Joshua translation systems. This…
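
The two lookup techniques named in the abstract, linear probing and interpolation search over sorted records, can be illustrated with a small sketch. This is not KenLM's code (KenLM itself is a C++ library); the class `ProbingTable`, the function `interpolation_search`, and the use of plain integers as stand-ins for n-gram keys are all assumptions made for illustration only.

```python
from typing import List, Optional, Tuple


class ProbingTable:
    """Fixed-size open-addressing hash table with linear probing.

    Hypothetical illustration: integer keys stand in for n-gram keys,
    and float values stand in for stored scores.
    """

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be positive")
        self.slots: List[Optional[Tuple[int, float]]] = [None] * capacity

    def insert(self, key: int, value: float) -> None:
        n = len(self.slots)
        i = key % n  # home slot for this key
        for _ in range(n):
            slot = self.slots[i]
            if slot is None or slot[0] == key:
                self.slots[i] = (key, value)
                return
            i = (i + 1) % n  # collision: step to the next slot
        raise RuntimeError("table is full")

    def find(self, key: int) -> Optional[float]:
        n = len(self.slots)
        i = key % n
        for _ in range(n):
            slot = self.slots[i]
            if slot is None:
                return None  # an empty slot ends the probe sequence
            if slot[0] == key:
                return slot[1]
            i = (i + 1) % n
        return None


def interpolation_search(keys: List[int], target: int) -> Optional[int]:
    """Return the index of target in the sorted list keys, or None.

    Instead of always probing the midpoint (binary search), estimate the
    position from where target falls between the boundary keys.
    """
    lo, hi = 0, len(keys) - 1
    while lo <= hi and keys[lo] <= target <= keys[hi]:
        if keys[hi] == keys[lo]:
            return lo if keys[lo] == target else None
        pos = lo + (target - keys[lo]) * (hi - lo) // (keys[hi] - keys[lo])
        if keys[pos] == target:
            return pos
        if keys[pos] < target:
            lo = pos + 1
        else:
            hi = pos - 1
    return None
```

For example, inserting keys 1, 9, and 17 into a `ProbingTable(8)` places all three in consecutive slots starting at slot 1, since they share the same home slot, and `find` walks that run until it hits the key or an empty slot. Interpolation search tends to need fewer probes than binary search when keys are spread roughly uniformly, which is one plausible reason to pair it with sorted records.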

Citation impact

Total citations: 1,146

FWCI: 75.62
Percentile: 100%
References: 16

Authors

Kenneth Heafield (corresponding author)
Carnegie Mellon University

Topics & keywords

Keywords

Trie
Computer science
Hash table
Data structure
Parallel computing
Implementation
Programming language
Theoretical computer science
