The author-topic model for authors and documents
UC Irvine Health · Stanford University · +1 more institution
Abstract
We introduce the author-topic model, a generative model for documents that extends Latent Dirichlet Allocation (LDA; Blei, Ng, & Jordan, 2003) to include authorship information. Each author is associated with a multinomial distribution over topics and each topic is associated with a multinomial distribution over words. A document with multiple authors is modeled as a distribution over topics that is a mixture of the distributions associated with the authors. We apply the model to a collection of 1,700 NIPS conference papers and 160,000 CiteSeer abstracts. Exact inference is intractable for these datasets and we use Gibbs sampling to estimate the topic and author distributions. We compare the performance with…
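The generative process described in the abstract can be sketched as below. This is a minimal illustration, not the authors' implementation. The abstract does not state how an author is picked for each word or which priors are used, so the uniform author choice, the symmetric Dirichlet draws, the toy vocabulary, and every name in the code (`sample_dirichlet`, `generate_document`, `theta`, `phi`) are assumptions made for the example.

```python
import random


def sample_dirichlet(alpha, size, rng):
    """Draw one sample from a symmetric Dirichlet by normalizing Gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(authors, theta, phi, vocab, n_words, rng):
    """Generate (author, topic, word) triples for one document.

    For each word position:
      1. pick one of the document's authors (uniformly; an assumption here),
      2. pick a topic from that author's distribution over topics,
      3. pick a word from that topic's distribution over words.
    """
    tokens = []
    topic_ids = list(range(len(phi)))
    for _ in range(n_words):
        author = rng.choice(authors)
        topic = rng.choices(topic_ids, weights=theta[author])[0]
        word = rng.choices(vocab, weights=phi[topic])[0]
        tokens.append((author, topic, word))
    return tokens


# Hypothetical toy setup: two authors, three topics, six vocabulary words.
rng = random.Random(0)
vocab = ["model", "topic", "neuron", "kernel", "sampling", "image"]
author_names = ["author_a", "author_b"]
n_topics = 3

# theta[a]: author a's distribution over topics.
theta = {a: sample_dirichlet(0.5, n_topics, rng) for a in author_names}
# phi[t]: topic t's distribution over vocabulary words.
phi = [sample_dirichlet(0.5, len(vocab), rng) for _ in range(n_topics)]

doc = generate_document(author_names, theta, phi, vocab, 10, rng)
for author, topic, word in doc:
    print(author, topic, word)
```

Because each word draws its topic from the distribution of one of the document's authors, the document's overall topic distribution is a mixture of its authors' distributions, which is the property the abstract describes. Estimating `theta` and `phi` from real text is the inference problem the abstract addresses with Gibbs sampling; this sketch only covers the forward, generative direction.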
Citation impact
- FWCI: 17.85
- Percentile: 100%
- References: 8
Authors: 4
Topics & keywords
- Latent Dirichlet allocation
- Multinomial distribution
- Topic model
- Computer science
- Inference
- Gibbs sampling
- Dirichlet distribution
- Generative model