Efficient Learning of Unlabeled Term Trees with Contractible Variables from Positive Data [chapter]

Yusuke Suzuki, Takayoshi Shoudai, Satoshi Matsumoto, Tomoyuki Uchida
2003 Lecture Notes in Computer Science  
In order to represent structural features common to tree-structured data, we propose an unlabeled term tree, which is a rooted tree pattern consisting of an unlabeled ordered tree structure and labeled variables. A variable is a labeled hyperedge which can be replaced with any unlabeled ordered tree of size at least 2. In this paper, we deal with a new kind of variable, called a contractible variable, that is an erasing variable which is adjacent to a leaf. A contractible variable can be replaced with any unlabeled ordered tree, including a singleton vertex. Let OTT^c be the set of all unlabeled term trees t such that all the labels attached to the variables of t are mutually distinct. For a term tree t in OTT^c, the term tree language L(t) of t is the set of all unlabeled ordered trees which are obtained from t by replacing all variables with unlabeled ordered trees. First, we give a polynomial time algorithm for deciding whether or not a given term tree in OTT^c matches a given unlabeled ordered tree. Next, for a term tree t in OTT^c, we define the canonical term tree c(t) of t in OTT^c, which satisfies L(c(t)) = L(t). Then, for two term trees t and t' in OTT^c, we show that if L(t) = L(t') then c(t) is isomorphic to c(t'). Using this fact, we give a polynomial time algorithm for finding a minimally generalized term tree in OTT^c which explains all given data. Finally, we conclude that the class OTT^c is polynomial time inductively inferable from positive data.
doi:10.1007/978-3-540-39917-9_23 fatcat:bew7ts5k4net7jdwnugybtsgxa