Low-rank matrix factorization for Deep Neural Network training with high-dimensional output targets

Sainath, Tara N.; Kingsbury, Brian; Sindhwani, Vikas; Arısoy, Ebru; Ramabhadran, Bhuvana

doi:10.1109/icassp.2013.6638949

Article · May 1, 2013 · Closed access

IBM Research - Thomas J. Watson Research Center · IBM (United States)

Indexed in Crossref

Abstract

While Deep Neural Networks (DNNs) have achieved tremendous success for large vocabulary continuous speech recognition (LVCSR) tasks, training of these networks is slow. One reason is that DNNs are trained with a large number of training parameters (i.e., 10–50 million). Because networks are trained with a large number of output targets to achieve good performance, the majority of these parameters are in the final weight layer. In this paper, we propose a low-rank matrix factorization of the final weight layer. We apply this low-rank technique to DNNs for both acoustic modeling and language modeling. We show on three different LVCSR tasks ranging between 50–400 hrs, that a low-rank factorization reduces the…
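The parameter saving described in the abstract can be illustrated with a small sketch. It assumes the final layer is a dense hidden-by-outputs weight matrix that is replaced by two smaller matrices of rank r. The layer sizes, the rank, and all function names below are hypothetical choices for illustration; they are not taken from the paper.

```python
def full_layer_params(hidden: int, outputs: int) -> int:
    """Weights in a dense final layer mapping hidden -> outputs (biases ignored)."""
    return hidden * outputs


def low_rank_layer_params(hidden: int, outputs: int, rank: int) -> int:
    """Weights after replacing the hidden x outputs matrix with a
    hidden x rank matrix followed by a rank x outputs matrix."""
    return hidden * rank + rank * outputs


def matvec(x, w):
    """Row vector x (length n) times matrix w (n rows, m columns)."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]


def low_rank_forward(x, a, b):
    """Apply the factored layer: the hidden x rank factor, then rank x outputs."""
    return matvec(matvec(x, a), b)


# Hypothetical sizes, chosen only to make the arithmetic concrete.
HIDDEN, OUTPUTS, RANK = 1024, 2000, 128

full = full_layer_params(HIDDEN, OUTPUTS)
low = low_rank_layer_params(HIDDEN, OUTPUTS, RANK)
reduction = 1 - low / full
print(f"full: {full}, low-rank: {low}, reduction: {reduction:.1%}")
```

Under these assumptions the factored layer has fewer weights whenever the rank is below hidden × outputs / (hidden + outputs), so the saving grows as the number of output targets grows relative to the chosen rank.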

Citation impact

Total citations: 603

FWCI: 41.54
Percentile: 100%
References: 25

Authors: 5

Topics & keywords

Keywords

Computer science
Rank (graph theory)
Vocabulary
Matrix decomposition
Factorization
Artificial neural network
Representation (politics)
Deep neural networks

UN Sustainable Development Goals

Quality Education