Using Convolutional Neural Networks to Classify Hate-Speech
Norwegian University of Science and Technology
Abstract
The paper introduces a deep learning-based system for classifying hate speech in Twitter text. The classifier assigns each tweet to one of four predefined categories: racism, sexism, both (racism and sexism), and non-hate-speech. Four Convolutional Neural Network models were trained on, respectively, character 4-grams, word vectors based on semantic information built using word2vec, randomly generated word vectors, and word vectors combined with character n-grams. The feature set was down-sized in the networks by max-pooling, and a softmax function was used to classify tweets. Tested by 10-fold cross-validation, the model based on word2vec embeddings performed best, with higher precision than recall and a 78.3% F-score.
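The building blocks named in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function names, the toy activations, and the weights are hypothetical, and the F-score is assumed to be the balanced F1 measure, which the abstract does not state explicitly.

```python
import math

LABELS = ["racism", "sexism", "both", "non-hate-speech"]

def char_ngrams(text, n=4):
    """Return all overlapping character n-grams of a string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def max_pool(feature_map):
    """Max-over-time pooling: keep the largest activation of one filter."""
    return max(feature_map)

def softmax(scores):
    """Map raw class scores to probabilities that sum to 1."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def f_score(precision, recall):
    """Balanced F1 (assumed here): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical activations of two convolution filters over one tweet.
feature_maps = [[0.1, 0.9, 0.3], [0.5, 0.2, 0.4]]
pooled = [max_pool(fm) for fm in feature_maps]

# Hypothetical output-layer weights, one row per class in LABELS.
weights = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [-1.0, -1.0]]
scores = [sum(w * p for w, p in zip(row, pooled)) for row in weights]
probs = softmax(scores)
predicted = LABELS[probs.index(max(probs))]
```

In this sketch, max-pooling reduces each filter's output to a single value, so the classifier input has one entry per filter regardless of tweet length; the softmax then selects one of the four categories.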
Citation impact
- FWCI: 43.72
- Percentile: 100%
- References: 22
Authors
- 2
Topics & keywords
- Word2vec
- Softmax function
- Computer science
- Artificial intelligence
- Convolutional neural network
- Pooling
- Word (group theory)
- Natural language processing
- Gender equality