Multilingual Denoising Pre-training for Neural Machine Translation
Indexed in: arXiv, DataCite
Abstract
This paper demonstrates that multilingual denoising pre-training produces significant performance gains across a wide variety of machine translation (MT) tasks. We present mBART -- a sequence-to-sequence denoising auto-encoder pre-trained on large-scale monolingual corpora in many languages using the BART objective. mBART is one of the first methods for pre-training a complete sequence-to-sequence model by denoising full texts in multiple languages, while previous approaches have focused only on the encoder, decoder, or reconstructing parts of the text. Pre-training a complete model allows it to be directly fine-tuned for supervised (both sentence-level and document-level) and unsupervised machine translation,…
Citation impact
- Total citations: 607
- FWCI: n/a
- Percentile: n/a
- References: 54
Authors: 8
Topics & keywords
Keywords
- Machine translation
- Computer science
- Initialization
- Artificial intelligence
- Encoder
- Natural language processing
- Sentence
- BLEU
UN Sustainable Development Goals
- Quality Education