Unified Language Model Pre-training for Natural Language Understanding and Generation
Microsoft Research (United Kingdom)
Abstract
This paper presents a new Unified pre-trained Language Model (UniLM) that can be fine-tuned for both natural language understanding and generation tasks. The model is pre-trained using three types of language modeling tasks: unidirectional, bidirectional, and sequence-to-sequence prediction. The unified modeling is achieved by employing a shared Transformer network and utilizing specific self-attention masks to control what context the prediction conditions on. UniLM compares favorably with BERT on the GLUE benchmark, and the SQuAD 2.0 and CoQA question answering tasks. Moreover, UniLM achieves new state-of-the-art results on five natural language generation datasets, including improving the CNN/DailyMail…
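The abstract's central mechanism, one shared Transformer whose self-attention mask decides what context each prediction conditions on, can be illustrated with a small sketch. This is not the authors' implementation: the function name `build_mask`, the mode names, and the exact sequence-to-sequence pattern (source tokens see the whole source; target tokens see the source plus earlier target tokens) are assumptions made for illustration.

```python
def build_mask(kind, src_len, tgt_len=0):
    """Return an n x n matrix where mask[i][j] is True if position i may attend to j.

    `kind` is one of "bidirectional", "left_to_right", or "seq2seq".
    These names are hypothetical labels, not identifiers from the paper.
    """
    n = src_len + tgt_len
    if kind == "bidirectional":
        # Every token conditions on the full sequence.
        return [[True] * n for _ in range(n)]
    if kind == "left_to_right":
        # Every token conditions only on itself and the tokens to its left.
        return [[j <= i for j in range(n)] for i in range(n)]
    if kind == "seq2seq":
        mask = []
        for i in range(n):
            if i < src_len:
                # Assumed pattern: source tokens see the whole source, no target.
                row = [j < src_len for j in range(n)]
            else:
                # Assumed pattern: target tokens see the source and the
                # target tokens up to and including themselves.
                row = [j <= i for j in range(n)]
            mask.append(row)
        return mask
    raise ValueError(f"unknown mask kind: {kind}")


if __name__ == "__main__":
    for row in build_mask("seq2seq", src_len=2, tgt_len=2):
        print("".join("1" if allowed else "0" for allowed in row))
```

In a real model the boolean matrix would typically be converted to additive scores (0 where attention is allowed, a large negative value where it is blocked) before the softmax, so the same network weights serve all three objectives and only the mask changes.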
Citation impact
- FWCI: n/a
- Percentile: n/a
- References: 48
Authors
Topics & keywords
- Computer science
- Automatic summarization
- Natural language generation
- Question answering
- Language model
- Natural language processing
- Artificial intelligence
- Transformer
- Quality Education