Feature selection for a rich HPSG grammar using decision trees

Kristina Toutanova, Christopher D. Manning
2002 Proceedings of the 6th Conference on Natural Language Learning - COLING-02   unpublished
This paper examines feature selection for log-linear models over rich constraint-based grammar (HPSG) representations by building decision trees over features in corresponding probabilistic context-free grammars (PCFGs). We show that single decision trees do not make optimal use of the available information; constructed ensembles of decision trees based on different feature subspaces show significant performance gains (14% parse selection error reduction). We compare the performance of the learned PCFG grammars and log-linear models over the same features.
doi:10.3115/1118853.1118883 fatcat:f77yx5txsbadrglqzju2ta3lba