The Risk of Racial Bias in Hate Speech Detection

Sap, Maarten; Card, Dallas; Gabriel, Saadia; Choi, Yejin; Smith, Noah A.

doi:10.18653/v1/p19-1163

Article · Jan 1, 2019 · Gold OA

Seattle University · University of Washington · +2 more institutions

Indexed in Crossref

Abstract

We investigate how annotators' insensitivity to differences in dialect can lead to racial bias in automatic hate speech detection models, potentially amplifying harm against minority populations. We first uncover unexpected correlations between surface markers of African American English (AAE) and ratings of toxicity in several widely-used hate speech datasets. Then, we show that models trained on these corpora acquire and propagate these biases, such that AAE tweets and tweets by self-identified African Americans are up to two times more likely to be labelled as offensive compared to others. Finally, we propose dialect and race priming as ways to reduce the racial bias in annotation, showing that when…
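
The "up to two times more likely" comparison in the abstract is a ratio of per-group rates at which a classifier labels tweets as offensive. A minimal sketch of that arithmetic follows, assuming a hypothetical list of (group, flagged) pairs; the group names, the helper functions, and the toy data are invented for illustration and are not taken from the paper or its datasets.

```python
from collections import defaultdict


def offensive_rate_by_group(records):
    """Return the fraction of items flagged as offensive for each group.

    `records` is an iterable of (group, flagged) pairs, where `flagged`
    is True when the classifier labelled the item as offensive.
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [flagged count, total count]
    for group, flagged in records:
        counts[group][0] += int(flagged)
        counts[group][1] += 1
    return {g: n_flagged / n_total for g, (n_flagged, n_total) in counts.items()}


def rate_ratio(rates, group, reference):
    """How many times more often `group` is flagged than `reference`."""
    return rates[group] / rates[reference]


# Toy data (made up): 4 of 10 "aae" items flagged, 2 of 10 "other" items flagged.
toy = (
    [("aae", True)] * 4 + [("aae", False)] * 6
    + [("other", True)] * 2 + [("other", False)] * 8
)
rates = offensive_rate_by_group(toy)
ratio = rate_ratio(rates, "aae", "other")
print(rates, ratio)
```

A ratio near 1 would indicate similar flagging rates across groups; the toy data is constructed so that one group is flagged twice as often as the other.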

Citation impact

Total citations: 763

FWCI: 63.98
Percentile: 100%
References: 42

Authors

5

Topics & keywords

Keywords

Offensive
Harm
Computer science
Priming (agriculture)
African American
Annotation
Racial bias
Natural language processing

UN Sustainable Development Goals

Reduced inequalities

Funding

National Science Foundation
Award: 1714566