Accurate unlexicalized parsing

Klein, Dan; Manning, Christopher D.

doi:10.3115/1075096.1075150

Article · Jan 1, 2003 · Gold OA

Dan Klein, Christopher D. Manning

Stanford University

Indexed in Crossref

Abstract

We demonstrate that an unlexicalized PCFG can parse much more accurately than previously shown, by making use of simple, linguistically motivated state splits, which break down false independence assumptions latent in a vanilla treebank grammar. Indeed, its performance of 86.36% (LP/LR F1) is better than that of early lexicalized PCFG models, and surprisingly close to the current state-of-the-art. This result has potential uses beyond establishing a strong lower bound on the maximum possible accuracy of unlexicalized models: an unlexicalized PCFG is much more compact, easier to replicate, and easier to interpret than more complex lexical models, and the parsing algorithms are simpler, more widely understood,…
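The abstract does not list the individual state splits. As a rough illustration of the general idea, the sketch below applies parent annotation, a standard example of a state split, to a toy tree. The tree, the tuple encoding, and the helper names (`parent_annotate`, `rules`, `is_preterminal`) are assumptions made for this example and are not taken from the paper. Leaving part-of-speech tags unannotated is also a simplification chosen here.

```python
from collections import Counter

# Trees are nested tuples: (label, child, child, ...); a word is a plain string.
# This encoding is an assumption for illustration only.

def is_preterminal(tree):
    """True for a part-of-speech node that directly dominates one word."""
    return len(tree) == 2 and isinstance(tree[1], str)

def parent_annotate(tree, parent=None):
    """Return a copy of the tree in which each phrasal label carries its parent's label.

    Example: an NP under S becomes 'NP^S'. Preterminals and words are left as-is,
    and the root keeps its bare label because it has no parent.
    """
    if isinstance(tree, str) or is_preterminal(tree):
        return tree
    label, children = tree[0], tree[1:]
    new_label = label if parent is None else f"{label}^{parent}"
    # Children are annotated with the ORIGINAL label of this node.
    return (new_label,) + tuple(parent_annotate(c, label) for c in children)

def rules(tree):
    """Yield (lhs, rhs) for every phrasal node; rhs is a tuple of child labels."""
    if isinstance(tree, str) or is_preterminal(tree):
        return
    yield tree[0], tuple(child[0] for child in tree[1:])
    for child in tree[1:]:
        yield from rules(child)

# Hypothetical toy tree: "She saw the dog"
TREE = ("S",
        ("NP", ("PRP", "She")),
        ("VP", ("VBD", "saw"),
               ("NP", ("DT", "the"), ("NN", "dog"))))

plain = Counter(rules(TREE))                   # NP -> PRP and NP -> DT NN share one symbol
split = Counter(rules(parent_annotate(TREE)))  # subject and object NPs become distinct symbols

if __name__ == "__main__":
    for lhs, rhs in sorted(split):
        print(lhs, "->", " ".join(rhs))
```

In the plain grammar, the subject NP and the object NP are the same symbol, so the model must treat their expansions as drawn from a single distribution. After the split, `NP^S` and `NP^VP` are separate symbols with separate rule probabilities, which is one way a false independence assumption of the kind the abstract mentions can be weakened.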

Citation impact

Total citations: 3,054
FWCI: 60.43
Percentile: 100%
References: 20
Authors: 2

Topics & keywords

Keywords

Treebank
Computer science
Parsing
Artificial intelligence
Independence (probability theory)
Grammar
Natural language processing
Simple (philosophy)

UN Sustainable Development Goals

Quality Education

Funding

National Science Foundation
Award: 0085896