Article · Dec 3, 2012 · Closed access
Multimodal Learning with Deep Boltzmann Machines
Abstract
A Deep Boltzmann Machine is described for learning a generative model of data that consists of multiple and diverse input modalities. The model can be used to extract a unified representation that fuses modalities together. We find that this representation is useful for classification and information retrieval tasks. The model works by learning a probability density over the space of multimodal inputs. It uses states of latent variables as representations of the input. The model can extract this representation even when some modalities are absent by sampling from the conditional distribution over them and filling them in. Our experimental results on bi-modal data consisting of images and text show that the…
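The fill-in idea described in the abstract can be illustrated with a much smaller model. The sketch below is not the paper's Deep Boltzmann Machine: it assumes a single-layer restricted Boltzmann machine with binary units for both modalities, trained with one step of contrastive divergence, and every name in it (`ToyBimodalRBM`, `cd1_step`, `fill_in_text`) is hypothetical. It only shows how a shared hidden layer can act as a joint representation and how a missing text input might be filled in by sampling from its conditional distribution while the image stays fixed.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class ToyBimodalRBM:
    """Toy stand-in (assumption): two binary visible groups, one shared hidden layer."""

    def __init__(self, n_img, n_txt, n_hid, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W_img = 0.01 * self.rng.standard_normal((n_img, n_hid))
        self.W_txt = 0.01 * self.rng.standard_normal((n_txt, n_hid))
        self.b_img = np.zeros(n_img)
        self.b_txt = np.zeros(n_txt)
        self.b_hid = np.zeros(n_hid)

    def sample(self, probs):
        # Draw binary states from independent Bernoulli probabilities.
        return (self.rng.random(probs.shape) < probs).astype(float)

    def hidden_probs(self, v_img, v_txt):
        # Joint representation: hidden activations driven by both modalities.
        return sigmoid(v_img @ self.W_img + v_txt @ self.W_txt + self.b_hid)

    def cd1_step(self, v_img, v_txt, lr=0.05):
        # One contrastive-divergence (CD-1) update on a batch of paired inputs.
        h0 = self.hidden_probs(v_img, v_txt)
        h_sample = self.sample(h0)
        r_img = sigmoid(h_sample @ self.W_img.T + self.b_img)
        r_txt = sigmoid(h_sample @ self.W_txt.T + self.b_txt)
        h1 = self.hidden_probs(r_img, r_txt)
        n = v_img.shape[0]
        self.W_img += lr * (v_img.T @ h0 - r_img.T @ h1) / n
        self.W_txt += lr * (v_txt.T @ h0 - r_txt.T @ h1) / n
        self.b_img += lr * (v_img - r_img).mean(axis=0)
        self.b_txt += lr * (v_txt - r_txt).mean(axis=0)
        self.b_hid += lr * (h0 - h1).mean(axis=0)

    def fill_in_text(self, v_img, n_steps=20):
        # Missing text: clamp the image and alternate Gibbs sampling between
        # the hidden layer and the text units (n_steps must be at least 1).
        v_txt = np.zeros((v_img.shape[0], self.b_txt.size))
        for _ in range(n_steps):
            h = self.sample(self.hidden_probs(v_img, v_txt))
            p_txt = sigmoid(h @ self.W_txt.T + self.b_txt)
            v_txt = self.sample(p_txt)
        # Return the last text probabilities and the fused representation.
        return p_txt, self.hidden_probs(v_img, v_txt)
```

Under these assumptions, a call such as `p_txt, h = model.fill_in_text(images)` returns per-unit text probabilities together with hidden activations that could be passed to a classifier or used as retrieval features.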
Citation impact
- Total citations: 626
- FWCI: 24.97
- Percentile: 100%
- References: 37
Authors: 2
Topics & keywords
Keywords
- Computer science
- Boltzmann machine
- Artificial intelligence
- Modalities
- Modality (human–computer interaction)
- Generative model
- Representation (politics)
- Kernel (algebra)
UN Sustainable Development Goals
- Quality Education