Weighted DAG Automata for Semantic Graphs

David Chiang, Frank Drewes, Daniel Gildea, Adam Lopez, Giorgio Satta
2018 Computational Linguistics  
Graphs have a variety of uses in natural language processing, particularly as representations of linguistic meaning. A deficit in this area of research is a formal framework for creating, combining, and using models involving graphs that parallels the frameworks of finite automata for strings and finite tree automata for trees. A possible starting point for such a framework is the formalism of DAG automata, defined by Kamimura and Slutzki and extended by Quernheim and Knight. In this article, we study the latter in depth, demonstrating several new results, including a practical recognition algorithm that can be used for inference and learning with models defined on DAG automata. We also propose an extension to graphs with unbounded node degree and show that our results carry over to the extended formalism.
By and large, these resources are based on, or equivalent to, graphs, in which vertices stand for entities and edges stand for semantic relations among them. The Semantic Dependency Parsing task at SemEval 2014 and 2015 (Oepen et al. 2014) converted several such resources into a unified graph format and invited participants to map from sentences to these semantic graphs. The unification of various kinds of semantic annotation into a single representation, semantic graphs, and the creation of large, broad-coverage collections of these representations are very positive developments for research in semantic processing. What is still missing, in our view, is a formal framework for creating, combining, and using models involving graphs that parallels those for strings and trees. Finite string automata and transducers served as a framework for investigation of speech recognition and computational phonology/morphology. Similarly, context-free grammars (and pushdown automata) served as a framework for investigation of computational syntax and syntactic parsing. But we lack a similar framework for learning and inferring semantic representations. Two such formalisms have recently been proposed for NLP. One is hyperedge replacement graph grammars, or HRGs (Bauderon and Courcelle 1987; Habel and Kreowski 1987; Habel 1992; Drewes, Kreowski, and Habel 1997), applied to AMR parsing by various authors (Chiang et al. 2013; Peng, Song, and Gildea 2015; Björklund, Drewes, and Ericson 2016). The other formalism is DAG automata, defined by Kamimura and Slutzki (1981) and extended by Quernheim and Knight (2012).
In this article, we study DAG automata in depth, with the goal of enabling efficient algorithms for natural language processing applications.
doi:10.1162/coli_a_00309 fatcat:sjj5cfnrpng4dfxdrqs3opogem