Article | Sep 13, 2005 | Closed access
Europarl: A Parallel Corpus for Statistical Machine Translation
Abstract
We collected a corpus of parallel text in 11 languages from the proceedings of the European Parliament, which are published on the web. This corpus has found widespread use in the NLP community. Here, we focus on its acquisition and its application as training data for statistical machine translation (SMT). We trained SMT systems for 110 language pairs, which reveal interesting clues about the challenges ahead.
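The figure of 110 language pairs follows from counting every ordered (source, target) combination of 11 languages: 11 × 10 = 110. A minimal sketch of that count is given here; the ISO 639-1 language codes and the helper name `directed_pairs` are assumptions added for illustration and do not appear in the abstract.

```python
from itertools import permutations

# Assumption: ISO 639-1 codes standing in for the 11 corpus languages.
# The abstract itself only states the number of languages, not the list.
LANGUAGES = ["da", "de", "el", "en", "es", "fi", "fr", "it", "nl", "pt", "sv"]


def directed_pairs(languages):
    """Return every ordered (source, target) pair with source != target."""
    return list(permutations(languages, 2))


pairs = directed_pairs(LANGUAGES)
print(len(pairs))  # n * (n - 1) ordered pairs for n languages
```

Each direction counts separately here, so English to German and German to English are two distinct translation systems.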
Citation impact
- Total citations: 3,110
- FWCI: 84.51
- Percentile: 100%
- References: 9
Topics & keywords
Keywords
- Machine translation
- Computer science
- Natural language processing
- Parallel corpora
- Artificial intelligence
- Example-based machine translation
- Computer-assisted translation