Character-Aware Neural Language Models
Harvard University · Courant Institute of Mathematical Sciences
Abstract
We describe a simple neural language model that relies only on character-level inputs. Predictions are still made at the word level. Our model employs a convolutional neural network (CNN) and a highway network over characters, whose output is given to a long short-term memory (LSTM) recurrent neural network language model (RNN-LM). On the English Penn Treebank the model is on par with the existing state-of-the-art despite having 60% fewer parameters. On languages with rich morphology (Arabic, Czech, French, German, Spanish, Russian), the model outperforms word-level/morpheme-level LSTM baselines, again with fewer parameters. The results suggest that on many languages, character inputs are sufficient for…
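The character pipeline named in the abstract (a CNN over characters followed by a highway network, whose output feeds the LSTM) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the toy dimensions, the tanh and ReLU nonlinearities, the zero-padding of short words, the omitted convolution bias, and the random weights are all assumptions made for illustration, and the LSTM itself is left out.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def char_cnn_features(char_ids, char_emb, filters):
    """Max-over-time convolution features for a single word.

    char_ids: 1-D int array of character indices.
    char_emb: (num_chars, d) character embedding matrix.
    filters:  list of arrays shaped (h, d, w): h filters of width w.
    All shapes here are illustrative assumptions.
    """
    x = char_emb[char_ids]                          # (word_len, d)
    feats = []
    for f in filters:
        h, d, w = f.shape
        x_pad = x
        if x.shape[0] < w:                          # zero-pad words shorter than the filter
            x_pad = np.vstack([x, np.zeros((w - x.shape[0], d))])
        # Every window of w consecutive character embeddings: (n_win, w, d)
        windows = np.stack([x_pad[i:i + w] for i in range(x_pad.shape[0] - w + 1)])
        conv = np.tanh(np.einsum("nwd,hdw->nh", windows, f))   # (n_win, h), bias omitted
        feats.append(conv.max(axis=0))              # max over positions -> (h,)
    return np.concatenate(feats)

def highway(y, W_h, b_h, W_t, b_t):
    """One highway layer: a gated mix of a transformed input and the input itself."""
    t = sigmoid(W_t @ y + b_t)                      # transform gate in (0, 1)
    g = np.maximum(0.0, W_h @ y + b_h)              # candidate transform (ReLU, assumed)
    return t * g + (1.0 - t) * y                    # carry gate is 1 - t

# Toy setup (hypothetical sizes): 30 characters, 4-dim embeddings,
# 3 filters of width 2 and 5 filters of width 3 -> 8 features per word.
num_chars, d, dim = 30, 4, 8
char_emb = rng.normal(size=(num_chars, d))
filters = [rng.normal(size=(3, d, 2)), rng.normal(size=(5, d, 3))]
W_h, W_t = rng.normal(size=(dim, dim)), rng.normal(size=(dim, dim))
b_h, b_t = np.zeros(dim), np.full(dim, -2.0)

word = np.array([3, 1, 20])                         # a word as character indices
y = char_cnn_features(word, char_emb, filters)
z = highway(y, W_h, b_h, W_t, b_t)                  # vector that would be fed to the LSTM
print(z.shape)
```

In this sketch each word is represented by a fixed-size vector regardless of its length, which is what allows the downstream recurrent model to keep predicting at the word level while reading only characters.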
Citation impact
- FWCI: —
- Percentile: —
- References: 55
Authors
4
Topics & keywords
- Computer science
- Treebank
- Morpheme
- Character (mathematics)
- Natural language processing
- Artificial intelligence
- Language model
- Recurrent neural network
- Quality Education