A Topic Modeling Comparison Between LDA, NMF, Top2Vec, and BERTopic to Demystify Twitter Posts
Fachhochschule Salzburg · MODUL University Vienna
Abstract
The richness of social media data has opened a new avenue for social science research to gain insights into human behaviors and experiences. In particular, emerging data-driven approaches relying on topic models provide entirely new perspectives on interpreting social phenomena. However, the short, text-heavy, and unstructured nature of social media content often leads to methodological challenges in both data collection and analysis. To bridge the developing field of computational science and empirical social research, this study aims to evaluate the performance of four topic modeling techniques, namely latent Dirichlet allocation (LDA), non-negative matrix factorization (NMF), Top2Vec, and BERTopic.…
Citation impact
- FWCI: 715.42
- Percentile: 100%
- References: 64
Authors: 2
Topics & keywords
- Latent Dirichlet allocation
- Social media
- Topic model
- Data science
- Computer science
- Computational sociology
- Context (archaeology)
- Non-negative matrix factorization