A new approach to cross-modal multimedia retrieval

Rasiwasia, Nikhil; Pereira, José Costa; Coviello, Emanuele; Doyle, Gabriel; Lanckriet, Gert; Lévy, Roger; Vasconcelos, Nuno

doi:10.1145/1873951.1873987

Article · Oct 25, 2010 · Closed access

University of California, San Diego

Indexed in Crossref

Abstract

The problem of jointly modeling the text and image components of multimedia documents is studied. The text component is represented as a sample from a hidden topic model, learned with latent Dirichlet allocation, and images are represented as bags of visual (SIFT) features. Two hypotheses are investigated: that 1) there is a benefit to explicitly modeling correlations between the two components, and 2) this modeling is more effective in feature spaces with higher levels of abstraction. Correlations between the two components are learned with canonical correlation analysis. Abstraction is achieved by representing text and images at a more general, semantic level. The two hypotheses are studied in the context of…
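
As a rough illustration of the correlation step the abstract mentions, the following is a minimal NumPy sketch of canonical correlation analysis between a text feature matrix and an image feature matrix, followed by a simple cross-modal ranking. It is not the authors' implementation. The synthetic data, the ridge term `reg`, the helper names `fit_cca` and `rank_images`, and the use of cosine similarity for ranking are all assumptions made for this example; in the setting described above, the text rows would be LDA topic proportions and the image rows would be bag-of-visual-words histograms.

```python
import numpy as np


def fit_cca(X, Y, n_components, reg=1e-3):
    """Fit CCA between paired rows of X (text) and Y (images).

    reg is a small ridge term (an assumption of this sketch) that keeps
    the covariance matrices well conditioned.
    """
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)

    # Whiten each view with its Cholesky factor: C = L L^T.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    T = np.linalg.solve(Lx, Cxy)        # Lx^{-1} Cxy
    T = np.linalg.solve(Ly, T.T).T      # Lx^{-1} Cxy Ly^{-T}

    # Singular values of the whitened cross-covariance are the
    # canonical correlations; singular vectors give the directions.
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    Wx = np.linalg.solve(Lx.T, U[:, :n_components])
    Wy = np.linalg.solve(Ly.T, Vt.T[:, :n_components])
    return {"mx": mx, "my": my, "Wx": Wx, "Wy": Wy, "corrs": s[:n_components]}


def rank_images(text_vec, image_mat, model):
    """Rank images for one text query by cosine similarity in the
    shared CCA space (the similarity choice is an assumption)."""
    q = (text_vec - model["mx"]) @ model["Wx"]
    G = (image_mat - model["my"]) @ model["Wy"]
    q = q / (np.linalg.norm(q) + 1e-12)
    G = G / (np.linalg.norm(G, axis=1, keepdims=True) + 1e-12)
    return np.argsort(-(G @ q))  # best match first


# Synthetic stand-in data: both views are noisy linear functions of a
# shared latent variable, so they are correlated by construction.
rng = np.random.default_rng(0)
n_docs, latent_dim, text_dim, image_dim = 200, 3, 10, 20
Z = rng.normal(size=(n_docs, latent_dim))
text_feats = Z @ rng.normal(size=(latent_dim, text_dim)) \
    + 0.1 * rng.normal(size=(n_docs, text_dim))
image_feats = Z @ rng.normal(size=(latent_dim, image_dim)) \
    + 0.1 * rng.normal(size=(n_docs, image_dim))

model = fit_cca(text_feats, image_feats, n_components=3)
corrs = model["corrs"]
ranking = rank_images(text_feats[0], image_feats, model)
print("canonical correlations:", np.round(corrs, 3))
print("top 5 images for text query 0:", ranking[:5])
```

Projecting both modalities into the same low-dimensional space is what makes a text query directly comparable to image candidates; the same `rank_images` pattern works in the other direction by swapping the roles of the two views.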

Citation impact

Total citations: 1,393
FWCI: 26.47
Percentile: 100%
References: 44

Authors: 7

Topics & keywords

Keywords

Computer science
Latent Dirichlet allocation
Abstraction
Modal
Information retrieval
Visual Word
Context (archaeology)
Canonical correlation

UN Sustainable Development Goals

Quality Education