Article · Applied Intelligence · Dec 19, 2014 · Hybrid OA

Audio-visual speech recognition using deep learning

Waseda University · Kyoto University · +1 more institution

Indexed in Crossref

Abstract

An audio-visual speech recognition (AVSR) system is thought to be one of the most promising solutions for reliable speech recognition, particularly when the audio is corrupted by noise. However, careful selection of sensory features is crucial for attaining high recognition performance. In the machine-learning community, deep learning approaches have recently attracted increasing attention because deep neural networks can effectively extract robust latent features that enable various recognition algorithms to demonstrate revolutionary generalization capabilities under diverse application conditions. This study introduces a connectionist-hidden Markov model (HMM) system for noise-robust AVSR. First, a deep…

Citation impact

  • Total citations: 583
  • FWCI: 18.27
  • Percentile: 100%
  • References: 56

Authors

5

Topics & keywords

Keywords
  • Computer science
  • Speech recognition
  • Artificial intelligence
  • Hidden Markov model
  • Convolutional neural network
  • Pattern recognition (psychology)
  • Deep learning
  • Noise (video)
UN Sustainable Development Goals
  • Peace, Justice and Strong Institutions

Funding