Learning accurate, compact, and interpretable tree annotation

Petrov, Slav; Barrett, Leon; Thibaux, Romain; Klein, Dan

doi:10.3115/1220175.1220230

Article · Jan 1, 2006 · Gold OA

University of California, Berkeley

Indexed in Crossref

Abstract

We present an automatic approach to tree annotation in which basic nonterminal symbols are alternately split and merged to maximize the likelihood of a training treebank. Starting with a simple X-bar grammar, we learn a new grammar whose nonterminals are subsymbols of the original nonterminals. In contrast with previous work, we are able to split various terminals to different degrees, as appropriate to the actual complexity in the data. Our grammars automatically learn the kinds of linguistic distinctions exhibited in previous work on manual tree annotation. On the other hand, our grammars are much more compact and substantially more accurate than previous work on automatic annotation. Despite its simplicity,…
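
The "split" half of the split-and-merge procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the rule-table representation, the function name `split_grammar`, the two-way split of every symbol, and the `noise` and `seed` parameters are all assumptions made for illustration. The merge step and the retraining that maximizes treebank likelihood are omitted, as are lexical rules.

```python
import random
from collections import defaultdict


def split_grammar(rules, noise=0.01, seed=0):
    """Toy split step: replace every symbol X by subsymbols X_0 and X_1.

    rules maps (parent, left, right) -> probability, with the
    probabilities for each parent summing to 1. (Hypothetical
    representation, chosen only for this sketch.)
    """
    rng = random.Random(seed)
    split = {}
    for (parent, left, right), prob in rules.items():
        for i in range(2):
            for j in range(2):
                for k in range(2):
                    # Share the rule's mass evenly over the four child
                    # combinations, then perturb it slightly so the two
                    # subsymbols are not exactly identical.
                    jitter = 1.0 + rng.uniform(-noise, noise)
                    key = (f"{parent}_{i}", f"{left}_{j}", f"{right}_{k}")
                    split[key] = prob / 4.0 * jitter

    # Renormalize so each split parent has a proper distribution again.
    totals = defaultdict(float)
    for (parent, _, _), prob in split.items():
        totals[parent] += prob
    return {key: prob / totals[key[0]] for key, prob in split.items()}


# A tiny X-bar-style rule table (made-up probabilities).
toy = {
    ("S", "NP", "VP"): 1.0,
    ("NP", "NP", "PP"): 0.4,
    ("NP", "DT", "NN"): 0.6,
    ("VP", "VB", "NP"): 1.0,
}
split_once = split_grammar(toy)
```

Each original rule becomes eight subsymbol rules, so repeated splitting grows the grammar quickly. That growth is why a merge step that can undo unhelpful splits matters for keeping the grammar compact.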

Citation impact

Total citations: 811

FWCI: 36.35
Percentile: 100%
References: 18

Topics & keywords

Keywords

Treebank
Terminal and nonterminal symbols
Computer science
Annotation
Natural language processing
Artificial intelligence
Tree (set theory)
Rule-based machine translation

UN Sustainable Development Goals

Quality Education
