Automatic grammar induction and parsing free text

Eric Brill
1993 Proceedings of the workshop on Human Language Technology - HLT '93
In this paper we describe a new technique for parsing free text: a transformational grammar is automatically learned that is capable of accurately parsing text into binary-branching syntactic trees with nonterminals unlabelled. The algorithm works by beginning in a very naive state of knowledge about phrase structure. By repeatedly comparing the results of bracketing in the current state to proper bracketing provided in the training corpus, the system learns a set of simple structural transformations that can be applied to reduce error. After describing the algorithm, we present results and compare these results to other recent results in automatic grammar induction.
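The learning loop summarized above (start from a naive bracketing, compare against the training bracketing, greedily keep the change that reduces error most) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the right-branching initial state, the single rotation-style rule template keyed on a pair of adjacent part-of-speech tags, the missing-bracket error count, and all function names. It is a simplified stand-in and not the transformation inventory or scoring used in the paper.

```python
# Illustrative sketch only: a toy error-driven learner over unlabelled binary
# bracketings. The rule template (a left rotation triggered by a tag pair) is
# a hypothetical simplification chosen to keep the example short.

def right_branching(tags):
    """Assumed naive initial state: bracket the sentence as a right-branching tree."""
    tree = tags[-1]
    for tag in reversed(tags[:-1]):
        tree = (tag, tree)
    return tree

def first_leaf(tree):
    """Return the leftmost leaf (a tag string) of a tree."""
    while isinstance(tree, tuple):
        tree = tree[0]
    return tree

def apply_rule(tree, rule):
    """Rewrite (A, (B, C)) as ((A, B), C) wherever A is the leaf rule[0] and
    B starts with the tag rule[1]; the children of each node are then visited."""
    if not isinstance(tree, tuple):
        return tree
    left, right = tree
    a, b = rule
    if left == a and isinstance(right, tuple) and first_leaf(right[0]) == b:
        left, right = (left, right[0]), right[1]
    return (apply_rule(left, rule), apply_rule(right, rule))

def spans(tree, start=0):
    """Return (set of (start, end) spans of internal nodes, end position)."""
    if not isinstance(tree, tuple):
        return set(), start + 1
    left_spans, mid = spans(tree[0], start)
    right_spans, end = spans(tree[1], mid)
    return left_spans | right_spans | {(start, end)}, end

def learn(corpus, max_rules=10):
    """Greedily pick, one at a time, the rule that most reduces the number of
    gold brackets missing from the current bracketing of the training corpus."""
    current = [right_branching(tags) for tags, _ in corpus]
    gold = [spans(tree)[0] for _, tree in corpus]
    candidates = sorted({pair for tags, _ in corpus for pair in zip(tags, tags[1:])})

    def total_errors(trees):
        return sum(len(g - spans(t)[0]) for t, g in zip(trees, gold))

    rules, best = [], total_errors(current)
    for _ in range(max_rules):
        if not candidates:
            break
        err, rule = min((total_errors([apply_rule(t, r) for t in current]), r)
                        for r in candidates)
        if err >= best:  # stop when no candidate rule reduces error
            break
        rules.append(rule)
        current = [apply_rule(t, rule) for t in current]
        best = err
    return rules

def parse(tags, rules):
    """Bracket a new tag sequence: naive start, then apply learned rules in order."""
    tree = right_branching(tags)
    for rule in rules:
        tree = apply_rule(tree, rule)
    return tree

# Toy training corpus: one tagged sentence with its gold bracketing.
corpus = [(["DT", "NN", "VB", "DT", "NN"],
           (("DT", "NN"), ("VB", ("DT", "NN"))))]
learned = learn(corpus)
```

In this sketch, `parse(["DT", "NN", "VB"], learned)` applies the learned rules, in the order they were learned, to a tag sequence that was not in the training data. The ordered rule list is the whole "grammar" here, which mirrors the abstract's description only at the level of the overall loop.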
doi:10.3115/1075671.1075726 fatcat:aiwky3drhng4hmb7w5wa3sob3y