Article · arXiv (Cornell University) · Jul 21, 2016 · Green OA

Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings

Microsoft Research New England (United States) · Microsoft (United States) · +1 more institution

Indexed in: arXiv · DataCite

Abstract

The blind application of machine learning risks amplifying biases present in data. Word embeddings, a popular framework for representing text data as vectors that has been used in many machine learning and natural language processing tasks, pose exactly this danger. We show that even word embeddings trained on Google News articles exhibit female/male gender stereotypes to a disturbing extent. This raises concerns because their widespread use, as we describe, often tends to amplify these biases. Geometrically, gender bias is first shown to be captured by a direction in the word embedding. Second, gender-neutral words are shown to be linearly separable from gender definition words in the word…
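As a rough illustration of what "captured by a direction" can mean geometrically, the sketch below measures how far a word vector extends along a single direction and then subtracts that component. It is a minimal sketch under stated assumptions, not the paper's procedure: the three-dimensional vectors are invented toy values, the direction is simply the normalized difference of the toy "he" and "she" vectors, and the helper names (`bias_component`, `remove_component`) are hypothetical.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    """Scale v to unit length."""
    n = math.sqrt(dot(v, v))
    return [a / n for a in v]

def bias_component(word_vec, direction):
    """Scalar projection of word_vec onto a unit-length direction."""
    return dot(word_vec, direction)

def remove_component(word_vec, direction):
    """Subtract the part of word_vec that lies along the unit direction."""
    c = dot(word_vec, direction)
    return [a - c * d for a, d in zip(word_vec, direction)]

# Toy 3-dimensional vectors, invented for illustration only (assumption).
toy = {
    "he": [1.0, 0.2, 0.0],
    "she": [-1.0, 0.2, 0.0],
    "programmer": [0.4, 0.9, 0.1],
}

# Assumed direction: normalized difference of one definitional pair.
gender_direction = normalize([a - b for a, b in zip(toy["he"], toy["she"])])

# Component of "programmer" along the direction, before and after removal.
before = bias_component(toy["programmer"], gender_direction)
neutralized = remove_component(toy["programmer"], gender_direction)
after = bias_component(neutralized, gender_direction)
```

With real embeddings the vectors would have hundreds of dimensions and the direction would normally be estimated from many word pairs rather than one; the single pair here only keeps the arithmetic easy to follow.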
