MPNet: Masked and Permuted Pre-training for Language Understanding

Song, Kaitao; Tan, Xu; Qin, Tao; Lu, Jianfeng; Liu, Tie‐Yan

doi:10.48550/arxiv.2004.09297

Preprint · arXiv (Cornell University) · Apr 20, 2020 · Green OA

Nanjing University of Science and Technology · Microsoft Research (United Kingdom)

Indexed in: arXiv, DataCite

Abstract

BERT adopts masked language modeling (MLM) for pre-training and is one of the most successful pre-training models. Since BERT neglects dependency among predicted tokens, XLNet introduces permuted language modeling (PLM) for pre-training to address this problem. However, XLNet does not leverage the full position information of a sentence and thus suffers from position discrepancy between pre-training and fine-tuning. In this paper, we propose MPNet, a novel pre-training method that inherits the advantages of BERT and XLNet and avoids their limitations. MPNet leverages the dependency among predicted tokens through permuted language modeling (vs. MLM in BERT), and takes auxiliary position information as input to…
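
The difference between the two objectives named in the abstract can be made concrete with a small sketch. The code below is illustrative only and is not taken from the paper: the function names `mlm_contexts` and `plm_contexts`, the example sentence, and the choice of predicted positions are all assumptions. It lists which tokens each predicted token may condition on under MLM (no other predicted token is visible) and under PLM (previously predicted tokens in the permutation are visible).

```python
from typing import Dict, List


def mlm_contexts(tokens: List[str], predicted: List[int]) -> Dict[int, List[str]]:
    """Tokens visible to each predicted position under masked language modeling.

    All predicted positions are masked at once, so each one sees only the
    non-predicted tokens and never another predicted token.
    """
    hidden = set(predicted)
    visible = [tok for i, tok in enumerate(tokens) if i not in hidden]
    return {pos: list(visible) for pos in predicted}


def plm_contexts(tokens: List[str], order: List[int], num_predicted: int) -> Dict[int, List[str]]:
    """Tokens visible to each predicted position under permuted language modeling.

    `order` is a permutation of positions (hypothetical input). The last
    `num_predicted` positions in that order are predicted one after another,
    each conditioning on everything that precedes it in the permutation.
    """
    cut = len(order) - num_predicted
    return {order[t]: [tokens[i] for i in order[:t]] for t in range(cut, len(order))}


# Assumed example: predict the last two tokens of a five-token sentence.
tokens = ["the", "task", "is", "sentence", "classification"]
mlm = mlm_contexts(tokens, predicted=[3, 4])
plm = plm_contexts(tokens, order=[0, 1, 2, 3, 4], num_predicted=2)

for pos in (3, 4):
    print(tokens[pos], "| MLM sees:", mlm[pos], "| PLM sees:", plm[pos])
```

In this sketch, "classification" is predicted without access to "sentence" under MLM, while under PLM it conditions on "sentence" as well. That is the dependency among predicted tokens the abstract refers to. The sketch models token visibility only and does not model position inputs.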

Citation impact

Total citations: 506

References: 26

Authors: 5

Topics & keywords

Keywords

Computer science
Leverage (statistics)
Language model
Dependency (UML)
Sentence
Margin (machine learning)
Artificial intelligence
Natural language processing

UN Sustainable Development Goals

Quality Education
