Multimodal Learning with Deep Boltzmann Machines

Srivastava, Nitish; Salakhutdinov, Ruslan

Article · Dec 3, 2012 · Closed access

Abstract

A Deep Boltzmann Machine is described for learning a generative model of data that consists of multiple and diverse input modalities. The model can be used to extract a unified representation that fuses modalities together. We find that this representation is useful for classification and information retrieval tasks. The model works by learning a probability density over the space of multimodal inputs. It uses states of latent variables as representations of the input. The model can extract this representation even when some modalities are absent by sampling from the conditional distribution over them and filling them in. Our experimental results on bi-modal data consisting of images and text show that the…
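The fill-in step described in the abstract can be illustrated with a minimal sketch. It assumes a single joint hidden layer shared by binary image and text units (a restricted Boltzmann machine, which is a simplification and not the paper's multi-layer Deep Boltzmann Machine) and untrained random weights. All names here (`fill_in_text`, `W_img`, `W_txt`, and so on) are hypothetical. With the image units clamped, alternating Gibbs updates sample the missing text units from their conditional distribution, and the hidden-unit probabilities then serve as the fused representation.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probs(v_img, v_txt, W_img, W_txt, b_h):
    """P(h_j = 1 | v_img, v_txt) for a hidden layer shared by both modalities."""
    probs = []
    for j in range(len(b_h)):
        act = b_h[j]
        act += sum(v_img[i] * W_img[i][j] for i in range(len(v_img)))
        act += sum(v_txt[k] * W_txt[k][j] for k in range(len(v_txt)))
        probs.append(sigmoid(act))
    return probs


def text_probs(h, W_txt, b_txt):
    """P(v_txt_k = 1 | h): the conditional used to fill in the missing modality."""
    return [
        sigmoid(b_txt[k] + sum(h[j] * W_txt[k][j] for j in range(len(h))))
        for k in range(len(b_txt))
    ]


def sample(probs, rng):
    """Draw one binary value per unit from its Bernoulli probability."""
    return [1 if rng.random() < p else 0 for p in probs]


def fill_in_text(v_img, W_img, W_txt, b_h, b_txt, steps, rng):
    """Clamp the image units and alternate Gibbs updates over h and v_txt."""
    v_txt = sample([0.5] * len(b_txt), rng)  # random start for the missing modality
    for _ in range(steps):
        h = sample(hidden_probs(v_img, v_txt, W_img, W_txt, b_h), rng)
        v_txt = sample(text_probs(h, W_txt, b_txt), rng)
    # Fused representation: hidden probabilities given image and filled-in text.
    return v_txt, hidden_probs(v_img, v_txt, W_img, W_txt, b_h)


def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_img, n_txt, n_hid = 6, 4, 3
W_img = random_matrix(n_img, n_hid, rng)  # assumption: untrained placeholder weights
W_txt = random_matrix(n_txt, n_hid, rng)
b_h, b_txt = [0.0] * n_hid, [0.0] * n_txt

image = [1, 0, 1, 1, 0, 0]  # observed modality
text, representation = fill_in_text(
    image, W_img, W_txt, b_h, b_txt, steps=20, rng=rng
)
```

Because the weights are random, the filled-in text carries no meaning here. The sketch only shows the mechanics of clamping one modality and sampling the other.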

Citation impact

Total citations: 626

FWCI: 24.97
Percentile: 100%
References: 37

Authors: 2

Topics & keywords

Keywords

Computer science
Boltzmann machine
Artificial intelligence
Modalities
Modality (human–computer interaction)
Generative model
Representation (politics)
Kernel (algebra)

UN Sustainable Development Goals

Quality Education
