Natural Language Grammar Induction Using a Constituent-Context Model

Dan Klein, Christopher D. Manning
2001 Neural Information Processing Systems  
This paper presents a novel approach to the unsupervised learning of syntactic analyses of natural language text. Most previous work has focused on maximizing likelihood according to generative PCFG models. In contrast, we employ a simpler probabilistic model over trees based directly on constituent identity and linear context, and use an EM-like iterative procedure to induce structure. This method produces much higher quality analyses, giving the best published results on the ATIS dataset.
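As a rough illustration of what "constituent identity and linear context" can mean for a candidate bracket, the sketch below enumerates every contiguous span of a tagged sentence together with the symbols it covers (its identity) and the symbols immediately to its left and right (its context). It is only a sketch under stated assumptions: the part-of-speech tag input, the function name `spans_with_context`, and the `BOUNDARY` marker are hypothetical choices, and neither the probabilistic model nor the EM-like procedure is implemented here.

```python
from typing import Iterator, List, Tuple

# Hypothetical marker for the sentence edge; not taken from the paper.
BOUNDARY = "<s>"

Span = Tuple[Tuple[str, ...], Tuple[str, str]]


def spans_with_context(tags: List[str]) -> Iterator[Span]:
    """Yield (identity, context) for every contiguous span of `tags`.

    identity: the tuple of symbols the span covers.
    context:  the (left, right) neighbours of the span, with BOUNDARY
              standing in at either end of the sentence.
    """
    n = len(tags)
    for i in range(n):
        for j in range(i + 1, n + 1):
            left = tags[i - 1] if i > 0 else BOUNDARY
            right = tags[j] if j < n else BOUNDARY
            yield tuple(tags[i:j]), (left, right)


if __name__ == "__main__":
    for identity, context in spans_with_context(["DT", "NN", "VBD"]):
        print(identity, context)
```

An induction procedure in this spirit would score such pairs differently depending on whether the span is treated as a constituent; the enumeration above only shows which observations such a model would have to work with.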
dblp:conf/nips/KleinM01 fatcat:o5an3ev2wrar3crigaottsudxm