Lexicalized Stochastic Modeling of Constraint-Based Grammars using Log-Linear Measures and EM Training [article]

Stefan Riezler, Detlef Prescher, Jonas Kuhn, Mark Johnson
2000 arXiv pre-print
We present a new approach to stochastic modeling of constraint-based grammars that is based on log-linear models and uses EM for estimation from unannotated data. The techniques are applied to an LFG grammar for German. Evaluation on an exact match task yields 86% precision for an ambiguity rate of 5.4, and 90% precision on a subcat frame match for an ambiguity rate of 25. Experimental comparison to training from a parsebank shows a 10% gain from EM training. Also, a new class-based grammar lexicalization is presented, showing a 10% gain over unlexicalized models.
arXiv:cs/0008034v1 fatcat:ty2zl26xxzcbppk35hzomb5pva