Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings
Microsoft Research New England (United States) · Microsoft (United States) · Boston University
Abstract
The blind application of machine learning runs the risk of amplifying biases present in data. Such a danger is facing us with word embedding, a popular framework to represent text data as vectors which has been used in many machine learning and natural language processing tasks. We show that even word embeddings trained on Google News articles exhibit female/male gender stereotypes to a disturbing extent. This raises concerns because their widespread use, as we describe, often tends to amplify these biases. Geometrically, gender bias is first shown to be captured by a direction in the word embedding. Second, gender neutral words are shown to be linearly separable from gender definition words in the word…
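The abstract's geometric claim, that gender bias is captured by a direction in the embedding, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the three-dimensional toy vectors are invented, the helper names are hypothetical, and taking the difference of a single word pair is a simplification rather than the procedure used in the paper.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]

def bias_direction(vec_a, vec_b):
    """Unit vector along the difference of two word vectors."""
    return normalize([a - b for a, b in zip(vec_a, vec_b)])

def bias_component(vec, direction):
    """Signed length of vec's projection onto the unit direction."""
    return dot(vec, direction)

def remove_component(vec, direction):
    """Subtract vec's projection onto direction, keeping the orthogonal part."""
    c = bias_component(vec, direction)
    return [a - c * d for a, d in zip(vec, direction)]

# Toy 3-dimensional vectors, invented for illustration only (not real embeddings).
toy = {
    "he": [1.0, 0.2, 0.0],
    "she": [-1.0, 0.2, 0.0],
    "programmer": [0.4, 0.5, 0.7],
}

g = bias_direction(toy["he"], toy["she"])      # candidate gender direction
before = bias_component(toy["programmer"], g)  # component along g
after = bias_component(remove_component(toy["programmer"], g), g)
```

With these toy values, `before` is 0.4 and `after` is 0.0: once the projection onto the direction is subtracted, the word vector no longer has any component along it. With real embeddings the vectors would come from a trained model rather than a hand-written dictionary.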
Citation impact
- FWCI: n/a
- Percentile: n/a
- References: 33
Authors
- Tolga Bolukbasi (corresponding author)
- Kai-Wei Chang
Microsoft Research New England (United States), Microsoft (United States)
- James Zou
Microsoft Research New England (United States), Microsoft (United States)
- Venkatesh Saligrama
Microsoft Research New England (United States), Microsoft (United States), Boston University
- Adam Tauman Kalai
Microsoft Research New England (United States), Microsoft (United States)
Topics & keywords
- Debiasing
- Word (group theory)
- Computer science
- Analogy
- Programmer
- Embedding
- Gender bias
- Natural language processing
- Quality Education