Multilingual Denoising Pre-training for Neural Machine Translation
Bircham International University · Bansal Institute Of Research Technology & Science · +2 more institutions
Abstract
This paper demonstrates that multilingual denoising pre-training produces significant performance gains across a wide variety of machine translation (MT) tasks. We present mBART, a sequence-to-sequence denoising auto-encoder pre-trained on large-scale monolingual corpora in many languages using the BART objective (Lewis et al., 2019). mBART is the first method for pre-training a complete sequence-to-sequence model by denoising full texts in multiple languages, whereas previous approaches have focused only on the encoder, decoder, or reconstructing parts of the text. Pre-training a complete model allows it to be directly fine-tuned for supervised (both sentence-level and document-level) and unsupervised machine…
Citation impact
- FWCI: 77.80
- Percentile: 100%
- References: 57
Authors (8)
- Yinhan Liu (corresponding): Bircham International University, Bansal Institute Of Research Technology & Science
- Jiatao Gu (corresponding): Meta (Israel), Meta (United States)
- Naman Goyal (corresponding): Meta (Israel), Meta (United States)
- Xian Li (corresponding): Meta (Israel), Meta (United States)
- Sergey Edunov (corresponding): Meta (Israel), Meta (United States)
Topics & keywords
- Computer science
- Machine translation
- Initialization
- Artificial intelligence
- Encoder
- Sentence
- Natural language processing
- BLEU
- Quality Education