Parsing with Traces: An O(n4) Algorithm and a Structural Representation

Jonathan K. Kummerfeld, Dan Klein
2017 Transactions of the Association for Computational Linguistics  
General treebank analyses are graph structured, but parsers are typically restricted to tree structures for efficiency and modeling reasons. We propose a new representation and algorithm for a class of graph structures that is flexible enough to cover almost all treebank structures, while still admitting efficient learning and inference. In particular, we consider directed, acyclic, one-endpoint-crossing graph structures, which cover most long-distance dislocation, shared argumentation, and
more » ... lar tree-violating linguistic phenomena. We describe how to convert phrase structure parses, including traces, to our new representation in a reversible manner. Our dynamic program uniquely decomposes structures, is sound and complete, and covers 97.3% of the Penn English Treebank. We also implement a proof-of-concept parser that recovers a range of null elements and trace types.
doi:10.1162/tacl_a_00072 fatcat:3cv6ajmhmbbvbmmoz43aguxtzy