The nested Chinese restaurant process and Bayesian nonparametric inference of topic hierarchies
Princeton University · University of California, Berkeley
Abstract
We present the nested Chinese restaurant process (nCRP), a stochastic process that assigns probability distributions to ensembles of infinitely deep, infinitely branching trees. We show how this stochastic process can be used as a prior distribution in a Bayesian nonparametric model of document collections. Specifically, we present an application to information retrieval in which documents are modeled as paths down a random tree, and the preferential attachment dynamics of the nCRP lead to clustering of documents according to sharing of topics at multiple levels of abstraction. Given a corpus of documents, a posterior inference algorithm finds an approximation to a posterior distribution over trees, topics…
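To illustrate the "documents as paths down a random tree" idea, the following is a minimal sketch of drawing paths from an nCRP prior. It is not the authors' inference algorithm: it only samples from the prior, and it truncates the tree at a fixed depth even though the abstract describes infinitely deep trees. The function name `sample_ncrp_paths`, the parameter names `depth` and `gamma` (a concentration parameter, assumed positive), and the tuple encoding of nodes are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def sample_ncrp_paths(num_docs, depth, gamma, seed=0):
    """Draw one root-to-leaf path per document from a depth-truncated nCRP.

    Hypothetical helper: nodes are encoded as tuples of child indices,
    with the root as the empty tuple. `gamma` must be positive.
    """
    rng = random.Random(seed)
    # child_counts[node][child] = number of earlier documents that
    # moved from `node` to that child.
    child_counts = defaultdict(dict)
    paths = []
    for _ in range(num_docs):
        node = ()
        path = [node]
        for _ in range(depth - 1):
            counts = child_counts[node]
            total = sum(counts.values())
            # Chinese restaurant process step: pick an existing child with
            # probability proportional to its count, or open a new child
            # with probability proportional to gamma.
            u = rng.random() * (total + gamma)
            chosen = None
            for child, n in counts.items():
                if u < n:
                    chosen = child
                    break
                u -= n
            if chosen is None:
                chosen = len(counts)  # index of the newly created child
            counts[chosen] = counts.get(chosen, 0) + 1
            node = node + (chosen,)
            path.append(node)
        paths.append(path)
    return paths


if __name__ == "__main__":
    for p in sample_ncrp_paths(num_docs=5, depth=3, gamma=1.0):
        print(p)
```

Because popular children attract more documents, paths tend to share their upper nodes, which is the preferential attachment behavior the abstract refers to. Smaller values of `gamma` should produce fewer branches and larger values more.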
Citation impact
- FWCI: 76.88
- Percentile: 100%
- References: 99
Authors: 3
Topics & keywords
- Computer science
- Bayesian inference
- Inference
- Nonparametric statistics
- Posterior probability
- Cluster analysis
- Tree (set theory)
- Bayesian probability