A Method of Automated Nonparametric Content Analysis for Social Science

Hopkins, Daniel J.; King, Gary

doi:10.1111/j.1540-5907.2009.00428.x

Article · American Journal of Political Science · Dec 28, 2009 · Green OA

Daniel J. Hopkins · Gary King

Georgetown University · Harvard University Press

Indexed in Crossref

Abstract

The increasing availability of digitized text presents enormous opportunities for social scientists. Yet hand coding many blogs, speeches, government records, newspapers, or other sources of unstructured text is infeasible. Although computer scientists have methods for automated content analysis, most are optimized to classify individual documents, whereas social scientists instead want generalizations about the population of documents, such as the proportion in a given category. Unfortunately, even a method with a high percent of individual documents correctly classified can be hugely biased when estimating category proportions. By directly optimizing for this social science goal, we develop a method that…
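The abstract's claim that high per-document accuracy can coexist with a biased proportion estimate can be illustrated with simple arithmetic. The sketch below is not the authors' method; it only shows the bias of naively counting classifier labels. The function names and the numbers (a true category share of 10%, sensitivity 0.60, specificity 0.98) are hypothetical values chosen for illustration.

```python
# Illustrative sketch only: shows why counting classifier labels can
# misestimate a category proportion even when accuracy is high.
# All numbers below are hypothetical, not taken from the paper.

def expected_accuracy(true_prop: float, sensitivity: float, specificity: float) -> float:
    """Expected share of documents whose label is correct."""
    return true_prop * sensitivity + (1 - true_prop) * specificity


def classify_and_count(true_prop: float, sensitivity: float, specificity: float) -> float:
    """Expected share of documents the classifier places in the category."""
    true_positives = true_prop * sensitivity
    false_positives = (1 - true_prop) * (1 - specificity)
    return true_positives + false_positives


true_prop = 0.10     # assumed true share of documents in the category
sensitivity = 0.60   # assumed P(labeled in category | truly in category)
specificity = 0.98   # assumed P(labeled out of category | truly out)

acc = expected_accuracy(true_prop, sensitivity, specificity)
est = classify_and_count(true_prop, sensitivity, specificity)
relative_bias = (est - true_prop) / true_prop

print(f"accuracy:            {acc:.1%}")
print(f"true proportion:     {true_prop:.1%}")
print(f"estimated proportion:{est:.1%}")
print(f"relative bias:       {relative_bias:+.0%}")
```

Under these assumed values the classifier labels about 94% of documents correctly, yet the counted share is about 7.8% against a true 10%, an underestimate of roughly a fifth. The errors do not cancel because false negatives and false positives occur at different rates in groups of different sizes.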

Citation impact

Total citations: 775

FWCI: 29.43
Percentile: 100%
References: 69

Authors: 2

Topics & keywords

Keywords

Computer science
Newspaper
Coding (social sciences)
Content analysis
Classifier (UML)
Data science
Nonparametric statistics
Population

UN Sustainable Development Goals

Quality Education
