A Model of Text for Experimentation in the Social Sciences
University of California, San Diego · Princeton University · +1 more institution
Abstract
Statistical models of text have become increasingly popular in statistics and computer science as a method of exploring large document collections. Social scientists often want to move beyond exploration, to measurement and experimentation, and make inference about social and political processes that drive discourse and content. In this article, we develop a model of text data that supports this type of substantive research. Our approach is to posit a hierarchical mixed membership model for analyzing topical content of documents, in which mixing weights are parameterized by observed covariates. In this model, topical prevalence and topical content are specified as a simple generalized linear model on an…
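As an illustration only, the idea of mixing weights parameterized by observed covariates can be sketched as follows. This is a minimal simulation, not the authors' estimator: it assumes a logistic-normal form in which each document's topic proportions are the softmax of a linear function of its covariates plus Gaussian noise. The names `simulate_topic_proportions`, `gamma`, and `sigma`, and the toy covariates, are hypothetical and do not come from the abstract.

```python
import numpy as np


def softmax(x):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def simulate_topic_proportions(X, gamma, sigma=0.5, seed=0):
    """Draw per-document topic proportions whose mean depends on covariates.

    X     : (D, P) matrix of observed document covariates (assumed layout).
    gamma : (P, K) hypothetical coefficients mapping covariates to K topics.
    sigma : standard deviation of the document-level Gaussian noise.
    """
    rng = np.random.default_rng(seed)
    noise = sigma * rng.standard_normal((X.shape[0], gamma.shape[1]))
    eta = X @ gamma + noise          # linear predictor plus document noise
    return softmax(eta)              # map to the simplex: each row sums to 1


# Toy data: two documents that differ only in a binary covariate
# (for example, a treatment indicator in an experiment).
X = np.array([[1.0, 0.0],            # intercept, control
              [1.0, 1.0]])           # intercept, treated
gamma = np.array([[0.0, 0.0, 0.0],   # baseline: all three topics equally likely
                  [2.0, 0.0, -2.0]]) # treatment shifts weight toward topic 0
theta = simulate_topic_proportions(X, gamma, sigma=0.0)
print(theta)
```

With the noise switched off (`sigma=0.0`), the control document spreads its weight evenly over the three topics, while the treated document puts more weight on topic 0, which is the kind of covariate effect on topical prevalence the abstract refers to.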
Citation impact
- FWCI: 500.20
- Percentile: 100%
- References: 146
Authors: 3
Topics & keywords
- Covariate
- Computer science
- Inference
- Data science
- Parameterized complexity
- Social media
- Statistical inference
- Data collection