Hierarchical Stochastic Image Grammars for Classification and Segmentation

W. Wang, I. Pollak, T.-S. Wong, C.A. Bouman, M.P. Harper, J.M. Siskind
2006 IEEE Transactions on Image Processing  
We develop a new class of hierarchical stochastic image models called spatial random trees (SRTs) which admit polynomial-complexity exact inference algorithms. Our framework of multitree dictionaries is the starting point for this construction. SRTs are stochastic hidden tree models whose leaves are associated with image data. The states at the tree nodes are random variables, and, in addition, the structure of the tree is random and is generated by a probabilistic grammar. We describe an efficient recursive algorithm for obtaining the maximum a posteriori estimate of both the tree structure and the tree states given an image. We also develop an efficient procedure for performing one iteration of the expectation-maximization algorithm and use it to estimate the model parameters from a set of training images. We also address other inference problems that arise in applications, such as maximization of posterior marginals and hypothesis testing. Our models and algorithms are illustrated through several image classification and segmentation experiments, ranging from the segmentation of synthetic images to the classification of natural photographs and the segmentation of scanned documents. In each case, we show that our method substantially improves accuracy over a variety of existing methods.

I. INTRODUCTION

In this work we develop a new methodology for constructing hierarchical stochastic image models called spatial random trees (SRTs). Similar to the models in [2], [10], [13]-[15], [18], [33], [36], [37], [45], [57], [58], our models are stochastic hidden tree models whose leaf nodes are associated with image data. Our key innovation, however, is that not only are the states at the nodes of the tree random variables, but the tree structure itself is also random and is generated by a probabilistic grammar [24], [35], [47]. While grammars have been used to develop tree-structured models for 1D signals [38], the generalization to 2D is not direct because the leaves of a tree generated by a grammar cannot
doi:10.1109/tip.2006.877496 pmid:17022268 fatcat:h5ixaf3ejzfvtpyp4yifkuoz74