The Pushshift Reddit Dataset
Max Planck Institute for Informatics · University of Colorado Boulder · +3 more institutions
Abstract
Social media data has become crucial to the advancement of scientific understanding. However, even though it has become ubiquitous, just collecting large-scale social media data requires considerable engineering skill and computational resources. In fact, research is often gated by data engineering problems that must be overcome before analysis can proceed. This has resulted in the recognition of datasets as meaningful research contributions in and of themselves. Reddit, the so-called “front page of the Internet,” in particular has been the subject of numerous scientific studies. Although Reddit is relatively open to data acquisition compared to social media platforms like Facebook and Twitter, the…
Citation impact
- FWCI: 138.68
- Percentile: 100%
- References: 107
Authors
5
Topics & keywords
- Social media
- Computer science
- Data science
- World Wide Web
- The Internet
- Set (abstract data type)
- Subject (documents)