Article · Jul 28, 2003 · Closed access

Modeling annotated data

University of California, Berkeley

Indexed in Crossref

Abstract

We consider the problem of modeling annotated data: data with multiple types where the instance of one type (such as a caption) serves as a description of the other type (such as an image). We describe three hierarchical probabilistic mixture models which aim to describe such data, culminating in correspondence latent Dirichlet allocation, a latent variable model that is effective at modeling the joint distribution of both types and the conditional distribution of the annotation given the primary type. We conduct experiments on the Corel database of images and captions, assessing performance in terms of held-out likelihood, automatic annotation, and text-based image retrieval.

Citation impact

Total citations: 1,069
FWCI: 54.81
Percentile: 100%
References: 18

Authors

2

Topics & keywords

Keywords
  • Latent Dirichlet allocation
  • Computer science
  • Annotation
  • Probabilistic logic
  • Latent variable
  • Artificial intelligence
  • Topic model
  • Data modeling

Funding