Kaldi Speech Recognition Toolkit

Povey, Daniel

doi:10.57702/jb3fvbn9

Article · Infoscience (Ecole Polytechnique Fédérale de Lausanne) · Jan 1, 2024 · Green OA

Daniel Povey

Microsoft (United States)

Indexed in DataCite

Abstract

We describe the design of Kaldi, a free, open-source toolkit for speech recognition research. Kaldi provides a speech recognition system based on finite-state transducers (using the freely available OpenFst), together with detailed documentation and scripts for building complete recognition systems. Kaldi is written in C++, and the core library supports modeling of arbitrary phonetic-context sizes and acoustic modeling with subspace Gaussian mixture models (SGMM) as well as standard Gaussian mixture models, together with all commonly used linear and affine transforms. Kaldi is released under the Apache License v2.0, which is highly nonrestrictive, making it suitable for a wide community of users.

Citation impact

4,898 total citations

FWCI: —
Percentile: —
References: 15

Authors

Daniel Povey (corresponding author)
Microsoft (United States)

Topics & keywords

Keywords

Computer science
Context (archaeology)
Scripting language
Speech recognition
Set (abstract data type)
Affine transformation
Subspace topology
Hidden Markov model
