Article · Nov 19, 2002 · Closed access
Improved backing-off for M-gram language modeling
Philips (Finland) · Philips (Germany) · +1 more institution
Indexed in Crossref
Abstract
In stochastic language modeling, backing-off is a widely used method to cope with the sparse data problem. In case of unseen events this method backs off to a less specific distribution. In this paper we propose to use distributions which are especially optimized for the task of backing-off. Two different theoretical derivations lead to distributions which are quite different from the probability distributions that are usually used for backing-off. Experiments show an improvement of about 10% in terms of perplexity and 5% in terms of word error rate.
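To make the mechanism concrete, here is a minimal sketch of conventional backing-off for a bigram model, together with a perplexity computation. It is not the method proposed in the paper: the backing-off distribution below is the plain unigram relative frequency, which is the kind of standard choice the abstract says the optimized distributions differ from. The names `train_backoff_bigram` and `perplexity`, the absolute-discounting scheme, and the `discount` value of 0.5 are assumptions made for illustration.

```python
import math
from collections import Counter


def train_backoff_bigram(tokens, discount=0.5):
    """Build prob(w, h) for a bigram model with absolute discounting.

    Assumption: the backing-off distribution is the unigram relative
    frequency, a conventional choice and not the paper's optimized one.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    # Number of times each word occurs as a history (has a successor).
    history_counts = Counter(tokens[:-1])
    total = sum(unigrams.values())

    # Set of distinct words observed after each history.
    followers = {}
    for h, w in bigrams:
        followers.setdefault(h, set()).add(w)

    def unigram(w):
        return unigrams[w] / total

    def prob(w, h):
        if h not in followers:
            # History never observed with a successor: use the unigram model.
            return unigram(w)
        n_h = history_counts[h]
        seen = followers[h]
        if w in seen:
            # Seen event: discounted relative frequency.
            return (bigrams[(h, w)] - discount) / n_h
        # Unseen event: back off to the less specific distribution,
        # scaled so that the probabilities for this history sum to one.
        freed_mass = discount * len(seen) / n_h
        unseen_mass = 1.0 - sum(unigram(v) for v in seen)
        if unseen_mass <= 0.0:
            return 0.0
        return freed_mass * unigram(w) / unseen_mass

    return prob


def perplexity(prob, tokens):
    """Perplexity of a token sequence under prob(w, h)."""
    pairs = list(zip(tokens, tokens[1:]))
    log_sum = sum(math.log(prob(w, h)) for h, w in pairs)
    return math.exp(-log_sum / len(pairs))


# Toy corpus, purely illustrative.
corpus = "the cat sat on the mat the cat ate".split()
model = train_backoff_bigram(corpus)
print(model("mat", "cat"))        # unseen bigram, handled by backing-off
print(perplexity(model, corpus))  # perplexity on the training text
```

In this sketch, the probability mass removed from seen bigrams by discounting is redistributed over unseen successors in proportion to the backing-off distribution. Changing that distribution, as the paper proposes, would only change the `unigram` function here.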
Citation impact
- Total citations: 1,493
- FWCI: 26.01
- Percentile: 100%
- References: 10
Authors: 2
Topics & keywords
Keywords
- Perplexity
- Language model
- Computer science
- Probability distribution
- Task (project management)
- n-gram
- Word error rate
- Word (group theory)
UN Sustainable Development Goals
- Quality Education