Head-Driven Statistical Models for Natural Language Parsing
Massachusetts Institute of Technology
Abstract
This article describes three statistical models for natural language parsing. The models extend methods from probabilistic context-free grammars to lexicalized grammars, leading to approaches in which a parse tree is represented as the sequence of decisions corresponding to a head-centered, top-down derivation of the tree. Independence assumptions then lead to parameters that encode the X-bar schema, subcategorization, ordering of complements, placement of adjuncts, bigram lexical dependencies, wh-movement, and preferences for close attachment. All of these preferences are expressed by probabilities conditioned on lexical heads. The models are evaluated on the Penn Wall Street Journal Treebank, showing that…
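As a rough illustration of what "a sequence of decisions corresponding to a head-centered, top-down derivation" could look like for a single local tree, the sketch below lists the decisions in one plausible order: the head child first, then the left modifiers from the head outward, then the right modifiers. The function name `head_centered_decisions`, the `STOP` marker, and the left-before-right order are assumptions made for this example and are not taken from the abstract; the models' actual conditioning on lexical heads, subcategorization, and distance is omitted.

```python
from typing import List, Tuple

# Hypothetical marker meaning "no more modifiers on this side".
STOP = "STOP"


def head_centered_decisions(
    parent: str, children: List[str], head_index: int
) -> List[Tuple[str, str]]:
    """Return (decision_kind, generated_label) pairs for one local tree.

    The head child is generated first, then left modifiers moving away
    from the head, then right modifiers moving away from the head.
    Each side ends with a STOP decision. `parent` is carried only to show
    what each decision would be conditioned on in a real model.
    """
    if not 0 <= head_index < len(children):
        raise ValueError(f"head_index out of range for children of {parent}")

    decisions: List[Tuple[str, str]] = [("head", children[head_index])]

    # Left modifiers, closest to the head first.
    for label in reversed(children[:head_index]):
        decisions.append(("left", label))
    decisions.append(("left", STOP))

    # Right modifiers, closest to the head first.
    for label in children[head_index + 1:]:
        decisions.append(("right", label))
    decisions.append(("right", STOP))

    return decisions


if __name__ == "__main__":
    # Toy rule S -> NP-TMP NP-SBJ VP with VP as the head child.
    for kind, label in head_centered_decisions("S", ["NP-TMP", "NP-SBJ", "VP"], 2):
        print(kind, label)
```

In a probabilistic model each of these decisions would receive its own conditional probability, and the probability of the local tree would be their product; the sketch only enumerates the decisions and assigns no probabilities.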
Citation impact
- FWCI: 208.94
- Percentile: 100%
- References: 63
Authors: 1
Topics & keywords
- Computer science
- Treebank
- Bigram
- Natural language processing
- Artificial intelligence
- Parsing
- Probabilistic logic
- Natural language understanding
- Quality Education