Unified Language Model Pre-training for Natural Language Understanding and Generation

Dong, Li

doi:10.48550/arxiv.1905.03197

Preprint · arXiv (Cornell University) · May 8, 2019 · Green OA

Li Dong

Microsoft Research (United Kingdom)

Indexed in: arXiv, DataCite

Abstract

This paper presents a new Unified pre-trained Language Model (UniLM) that can be fine-tuned for both natural language understanding and generation tasks. The model is pre-trained using three types of language modeling tasks: unidirectional, bidirectional, and sequence-to-sequence prediction. The unified modeling is achieved by employing a shared Transformer network and utilizing specific self-attention masks to control what context the prediction conditions on. UniLM compares favorably with BERT on the GLUE benchmark, and the SQuAD 2.0 and CoQA question answering tasks. Moreover, UniLM achieves new state-of-the-art results on five natural language generation datasets, including improving the CNN/DailyMail…
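The abstract says a single shared Transformer covers all three pre-training objectives by switching self-attention masks. The sketch below illustrates one plausible reading of that idea in plain Python. The function name `build_attention_mask`, the mode strings, and the exact mask layouts are assumptions made for illustration; the abstract does not spell them out, and this is not the authors' implementation.

```python
def build_attention_mask(mode, src_len, tgt_len=0):
    """Build an n x n mask where mask[i][j] == 1 means position i may attend to j.

    Illustrative assumption: the sequence is a source segment of length
    src_len followed by a target segment of length tgt_len.
    """
    n = src_len + tgt_len
    if mode == "bidirectional":
        # Every position conditions on the full context.
        return [[1] * n for _ in range(n)]
    if mode == "unidirectional":
        # Left-to-right: position i sees itself and everything before it.
        return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]
    if mode == "seq2seq":
        mask = []
        for i in range(n):
            row = []
            for j in range(n):
                if j < src_len:
                    row.append(1)  # all positions may see the source segment
                elif i >= src_len and j <= i:
                    row.append(1)  # target sees itself and earlier target positions
                else:
                    row.append(0)  # source never sees target; target never sees the future
            mask.append(row)
        return mask
    raise ValueError(f"unknown mode: {mode}")


# Two source tokens followed by two target tokens.
for row in build_attention_mask("seq2seq", src_len=2, tgt_len=2):
    print(row)
```

In a real model such a 0/1 matrix would typically be converted into large negative values added to the attention scores before the softmax, so that blocked positions receive near-zero weight.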

Citation impact

949 total citations

FWCI: —
Percentile: —
References: 48

Authors

1 author

Li Dong (corresponding author)
Microsoft Research (United Kingdom)

Topics & keywords

Keywords

Computer science
Automatic summarization
Natural language generation
Question answering
Language model
Natural language processing
Artificial intelligence
Transformer

UN Sustainable Development Goals

Quality Education
