Formalizing natural‐language spatial relations between linear objects with topological and metric properties

Jun Xu
2007 International Journal of Geographical Information Science  
People usually use qualitative terms to express spatial relations, while current geographic information systems (GISs) all use quantitative approaches to store spatial information. The abilities of current GISs to represent and query spatial information about geographic space are limited. Based on the result of a human-subject test of natural-language descriptions of spatial relations between linear geographic objects, this paper defines a series of quantitative indices that are related to natural-language spatial relation terms, and uses these indices to formalize the ambiguous natural-language representation with a decision-tree algorithm. The result indicates that using both topological indices and metric indices can formalize the natural-language spatial predicates better than using only topological indices. The rules extracted from the trees are used to characterize the spatial relations into qualitative description groups. Using these rules, a prototype of an intelligent natural-language interface for the ESRI software ArcGIS that can query spatial relations between two linear objects in natural English language is implemented using SNePS (the Semantic Network Processing System).
[...] relations have been developed. Region Connection Calculus (RCC) is a region-based method of representing topological spatial relations (Randell et al. 1992, Cohn 1996, Cohn et al. 1997a, 1997b). An alternative approach to representing and reasoning about topological relations is point-set topological spatial relations (Egenhofer and Franzosa 1991, Egenhofer and Herring 1994). Directional spatial relations have been described by Frank (1991) and Zhan and Peuquet (1987). Hernandez (1991) represented the order and orientation of spatial relations in a two-dimensional space qualitatively. Freksa and co-workers introduced an orientation grid for representing qualitative orientation information (Freksa 1992, Freksa and Zimmermann 1992). Qualitative distance and proximity relations were studied by using fuzzy set membership (Gahegan 1995, Hernandez et al. 1995, Yao 2002). Other methods in qualitative spatial reasoning include an algebraic approach (Smith and Park 1992), partially ordered sets (Kainz et al. 1993), a computational model for characterizing spatial prepositions (Abella and Kender 1993), approaches combining different information (Bennett et al. 1997, Clementini et al. 1997), and a proximity approach for formalizing a region-based theory of space (Vakarelov et al. 2002).
The problem of 'understanding' natural language can be treated as a problem of 'translating' between natural languages and formal languages within a very limited domain (Frank and Mark 1991). To bridge the gap between natural-language terms and a computational model of spatial relations, it is necessary to fully understand the relationship between the ambiguous natural-language representations and the geometric spatial relations of geographic objects, and to formalize the qualitative natural-language terms. [...] designed several human-subject protocols to explore, evaluate or refine computational models of spatial relations in natural language. Human-subject experiments have been conducted to confirm these formal models of spatial relations. Egenhofer (1994, 1995) refined and calibrated the meaning of spatial predicates from English and Spanish concerning line-region relations through cross-linguistic human-subject testing. Based on their study, Shariff et al. (1998) developed a formal model to capture the metric and topological detail of natural-language spatial relations, and implemented the natural-language-like query in a geographic database. Topological relations were identified by the 9-intersection model. Two groups of metric details were derived: splitting ratios, which are the normalized values of lengths and areas of intersections; and closeness measures, which are the normalized distances between disjoint object parts. The resulting model of topological and metric properties was calibrated for 64 English-language terms about spatial relations between a line and a region. Recently, Nedas et al. (2006) used topological and metric models to specify the geometry of line-line spatial relations. In this paper, the spatial relationship between two linear objects is studied. A series of topological and metric indices describing the spatial relations of two linear objects are defined.
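To make the idea of a normalized closeness measure concrete, the following is a minimal sketch for two polylines. It is not the index set defined in the paper: the function names, the choice to normalize by the length of the shorter line, and the use of a proper-crossing count as a simple topological index are all assumptions made for illustration.

```python
import math

def _point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:  # degenerate segment: a and b coincide
        return math.hypot(p[0] - a[0], p[1] - a[1])
    # Parameter of the projection of p onto the segment, clamped to [0, 1].
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))

def _orient(a, b, c):
    """Signed area test: > 0 if c lies to the left of the directed line a->b."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def _segments_cross(a, b, c, d):
    """True if segments a-b and c-d cross at a single interior point."""
    return (_orient(c, d, a) * _orient(c, d, b) < 0
            and _orient(a, b, c) * _orient(a, b, d) < 0)

def _segment_distance(a, b, c, d):
    """Minimum distance between segments a-b and c-d (0 if they meet)."""
    if _segments_cross(a, b, c, d):
        return 0.0
    # Otherwise the minimum is attained at an endpoint of one segment.
    return min(_point_segment_distance(a, c, d),
               _point_segment_distance(b, c, d),
               _point_segment_distance(c, a, b),
               _point_segment_distance(d, a, b))

def _segments(line):
    """Consecutive vertex pairs of a polyline given as a list of (x, y)."""
    return list(zip(line[:-1], line[1:]))

def polyline_length(line):
    return sum(math.hypot(b[0] - a[0], b[1] - a[1]) for a, b in _segments(line))

def crossing_count(line_a, line_b):
    """Illustrative topological index: number of proper segment crossings."""
    return sum(_segments_cross(a, b, c, d)
               for a, b in _segments(line_a)
               for c, d in _segments(line_b))

def closeness(line_a, line_b):
    """Illustrative metric index: minimum distance between the two lines,
    normalized by the length of the shorter line (an assumed normalization)."""
    min_dist = min(_segment_distance(a, b, c, d)
                   for a, b in _segments(line_a)
                   for c, d in _segments(line_b))
    return min_dist / min(polyline_length(line_a), polyline_length(line_b))

# Hypothetical data: a road and a river running parallel, 2 units apart.
road = [(0.0, 0.0), (10.0, 0.0)]
river = [(0.0, 2.0), (10.0, 2.0)]
print(crossing_count(road, river), closeness(road, river))
```

Normalizing by a line length makes the value independent of map units, which is the property the closeness measures described above rely on; a different denominator would serve equally well for illustration.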
Based on the result of a human-subject test, these indices are used to formalize the natural-language terms about spatial relations of two linear objects using a decision-tree data mining algorithm. Finally, the formalized rules are applied in the ESRI software ArcGIS to fulfil natural-language queries of spatial relations between two linear geographic objects.
Natural-language descriptions of spatial relations
The human-subject study was conducted to find how people choose words to describe spatial relations between linear objects in different situations. It is a human-computer interactive procedure. A series of maps, with each map showing
doi:10.1080/13658810600894323 fatcat:n6gzgocl4nbnvnh3plwa4pzhoi