article | Jan 1, 2009 | GOLD OA

Labeled LDA

Stanford University

Indexed in Crossref

Abstract

A significant portion of the world's text is tagged by readers on social bookmarking websites. Credit attribution is an inherent problem in these corpora because most pages have multiple tags, but the tags do not always apply with equal specificity across the whole document. Solving the credit attribution problem requires associating each word in a document with the most appropriate tags and vice versa. This paper introduces Labeled LDA, a topic model that constrains Latent Dirichlet Allocation by defining a one-to-one correspondence between LDA's latent topics and user tags. This allows Labeled LDA to directly learn word-tag correspondences. We demonstrate Labeled LDA's improved expressiveness over…
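
The one-to-one constraint between topics and tags described in the abstract can be illustrated with a small sampler. The following is a minimal sketch, not the authors' implementation: it assumes a collapsed Gibbs sampler in the style commonly used for LDA, with the single change that each word may only be assigned to a tag attached to its own document. The function name `labeled_lda_gibbs`, the hyperparameter values, and the toy corpus are all hypothetical.

```python
import random
from collections import defaultdict


def labeled_lda_gibbs(docs, doc_labels, n_iters=200, alpha=0.1, beta=0.01, seed=0):
    """Illustrative collapsed Gibbs sampler with topics restricted to document tags.

    docs: list of token lists; doc_labels: list of non-empty tag lists.
    Returns per-token tag assignments and per-tag word counts.
    """
    rng = random.Random(seed)
    labels = sorted({l for ls in doc_labels for l in ls})
    vocab_size = len({w for d in docs for w in d})

    label_word = {l: defaultdict(int) for l in labels}   # tag -> word -> count
    label_total = {l: 0 for l in labels}                 # tag -> total tokens
    doc_label = [{l: 0 for l in ls} for ls in doc_labels]  # per-doc tag counts

    # Random initialisation, already restricted to each document's own tags.
    assign = []
    for d, (tokens, ls) in enumerate(zip(docs, doc_labels)):
        zs = []
        for w in tokens:
            z = rng.choice(ls)
            zs.append(z)
            label_word[z][w] += 1
            label_total[z] += 1
            doc_label[d][z] += 1
        assign.append(zs)

    for _ in range(n_iters):
        for d, (tokens, ls) in enumerate(zip(docs, doc_labels)):
            for i, w in enumerate(tokens):
                # Remove the current assignment from the counts.
                z = assign[d][i]
                label_word[z][w] -= 1
                label_total[z] -= 1
                doc_label[d][z] -= 1
                # Resample among this document's tags only (the key constraint).
                weights = [
                    (doc_label[d][l] + alpha)
                    * (label_word[l][w] + beta)
                    / (label_total[l] + vocab_size * beta)
                    for l in ls
                ]
                z = rng.choices(ls, weights=weights)[0]
                assign[d][i] = z
                label_word[z][w] += 1
                label_total[z] += 1
                doc_label[d][z] += 1
    return assign, label_word


# Hypothetical toy corpus: every page carries one or two tags.
docs = [
    ["python", "code", "java", "code"],
    ["python", "snake", "zoo"],
    ["java", "coffee", "cup"],
]
doc_labels = [["programming"], ["programming", "animals"], ["programming", "drinks"]]
assign, label_word = labeled_lda_gibbs(docs, doc_labels)
```

Because sampling is limited to a document's own tags, every word ends up credited to one of those tags, which is the word-tag correspondence the abstract refers to. A page with a single tag has all of its words attributed to that tag by construction.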

Citation impact

  • Total citations: 1,208
  • FWCI: 59.31
  • Percentile: 100%
  • References: 15

Authors

4

Topics & keywords

Keywords
  • Latent Dirichlet allocation
  • Computer science
  • Discriminative model
  • Topic model
  • Artificial intelligence
  • Classifier (UML)
  • Natural language processing
  • Baseline (sea)
UN Sustainable Development Goals
  • Reduced inequalities

Funding