Article · Jan 1, 2004 · Gold OA

Automatic evaluation of machine translation quality using longest common subsequence and skip-bigram statistics

University of Southern California · Marina Del Rey Hospital

Indexed in Crossref

Abstract

In this paper we describe two new objective automatic evaluation methods for machine translation. The first method is based on the longest common subsequence between a candidate translation and a set of reference translations. The longest common subsequence naturally takes sentence-level structural similarity into account and automatically identifies the longest co-occurring in-sequence n-grams. The second method relaxes strict n-gram matching to skip-bigram matching. A skip-bigram is any pair of words in their sentence order. Skip-bigram co-occurrence statistics measure the overlap of skip-bigrams between a candidate translation and a set of reference translations. The empirical results show that both methods correlate with…
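The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names are hypothetical, it compares a candidate against a single reference rather than a set, it tokenizes on whitespace, and combining precision and recall through an F-measure with a `beta` weight is an assumption, since the abstract does not say how the scores are computed.

```python
from collections import Counter
from itertools import combinations


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            # Extend a match diagonally, else carry the best neighbour.
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def skip_bigrams(tokens):
    """Multiset of all word pairs in sentence order, with any gap allowed."""
    return Counter(combinations(tokens, 2))


def f_measure(matches, cand_total, ref_total, beta=1.0):
    """Assumed scoring: weighted harmonic mean of precision and recall."""
    if matches == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = matches / cand_total
    recall = matches / ref_total
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)


def lcs_score(candidate, reference, beta=1.0):
    """LCS-based similarity of a candidate against one reference."""
    c, r = candidate.split(), reference.split()
    return f_measure(lcs_length(c, r), len(c), len(r), beta)


def skip_bigram_score(candidate, reference, beta=1.0):
    """Skip-bigram overlap of a candidate against one reference."""
    c = skip_bigrams(candidate.split())
    r = skip_bigrams(reference.split())
    overlap = sum((c & r).values())  # multiset intersection
    return f_measure(overlap, sum(c.values()), sum(r.values()), beta)


reference = "police killed the gunman"
print(lcs_score("police kill the gunman", reference))
print(skip_bigram_score("police kill the gunman", reference))
print(skip_bigram_score("the gunman kill police", reference))
```

With the reference "police killed the gunman", the candidate "police kill the gunman" shares the subsequence "police the gunman" and three of six skip-bigrams, while the scrambled candidate "the gunman kill police" shares only one skip-bigram. That is the sense in which both measures are sensitive to word order.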

Citation impact

Total citations: 731
FWCI: 20.81
Percentile: 100%
References: 22

Authors

2

Topics & keywords

Keywords
  • Bigram
  • Computer science
  • Machine translation
  • Longest common subsequence problem
  • Artificial intelligence
  • Set (abstract data type)
  • Natural language processing
  • Subsequence
UN Sustainable Development Goals
  • Quality Education