Article | The Annals of Applied Statistics | Jun 1, 2007 | Bronze OA

A correlated topic model of Science

David M. Blei, John D. Lafferty
Indexed in: arXiv, Crossref

Abstract

Topic models, such as latent Dirichlet allocation (LDA), can be useful tools for the statistical analysis of document collections and other discrete data. The LDA model assumes that the words of each document arise from a mixture of topics, each of which is a distribution over the vocabulary. A limitation of LDA is the inability to model topic correlation even though, for example, a document about genetics is more likely to also be about disease than X-ray astronomy. This limitation stems from the use of the Dirichlet distribution to model the variability among the topic proportions. In this paper we develop the correlated topic model (CTM), where the topic proportions exhibit correlation via the logistic…
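The contrast drawn in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code. It assumes the logistic normal construction (a Gaussian draw mapped to the simplex with a softmax) for the correlated case, and every name and number in it is hypothetical: the three topic labels, the covariance matrix `SIGMA`, the symmetric Dirichlet parameter, and all helper functions. It only compares how two topic proportions co-vary under each prior and does not implement inference for either model.

```python
import math
import random

# Hypothetical topics, in order: genetics, disease, X-ray astronomy.
# Assumed covariance: genetics and disease vary together, astronomy is independent.
SIGMA = [[1.0, 0.9, 0.0],
         [0.9, 1.0, 0.0],
         [0.0, 0.0, 1.0]]


def sample_dirichlet(alpha, rng):
    """Draw proportions from Dirichlet(alpha) by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def cholesky(sigma):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(sigma)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(sigma[i][i] - partial)
            else:
                lower[i][j] = (sigma[i][j] - partial) / lower[j][j]
    return lower


def sample_logistic_normal(mu, chol, rng):
    """Draw eta ~ N(mu, Sigma) using the Cholesky factor, then apply a softmax."""
    z = [rng.gauss(0.0, 1.0) for _ in mu]
    eta = [m + sum(chol[i][k] * z[k] for k in range(i + 1))
           for i, m in enumerate(mu)]
    peak = max(eta)  # subtract the maximum for numerical stability
    weights = [math.exp(e - peak) for e in eta]
    total = sum(weights)
    return [w / total for w in weights]


def pearson(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def compare_correlations(seed=0, n_docs=5000):
    """Return the genetics/disease correlation under each prior."""
    rng = random.Random(seed)
    chol = cholesky(SIGMA)
    dirichlet = [sample_dirichlet([1.0, 1.0, 1.0], rng) for _ in range(n_docs)]
    logistic = [sample_logistic_normal([0.0, 0.0, 0.0], chol, rng)
                for _ in range(n_docs)]
    dir_corr = pearson([t[0] for t in dirichlet], [t[1] for t in dirichlet])
    ln_corr = pearson([t[0] for t in logistic], [t[1] for t in logistic])
    return dir_corr, ln_corr


if __name__ == "__main__":
    dir_corr, ln_corr = compare_correlations()
    print(f"Dirichlet genetics/disease correlation:       {dir_corr:+.2f}")
    print(f"Logistic normal genetics/disease correlation: {ln_corr:+.2f}")
```

Under the symmetric Dirichlet the two proportions are negatively correlated, since the components compete for a fixed total and the prior has no parameter that can express a positive association. Under the assumed covariance the logistic normal draw gives a positive correlation between the genetics and disease proportions.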

Citation impact

Total citations: 940
FWCI: 15.01
Percentile: 100%
References: 29

Authors

  • David M. Blei (corresponding)
  • John D. Lafferty

Topics & keywords

Keywords
  • Topic model
  • Latent Dirichlet allocation
  • Inference
  • Dirichlet distribution
  • Set (abstract data type)
  • Statistical inference
  • Data set
  • Statistical model

Funding