Linked open data alignment & querying

Prateek Jain
2013 ACM SIGWEB Newsletter  
The recent emergence of the Linked Data approach for publishing data represents a major step forward in realizing the original vision of a web that can "understand and satisfy the requests of people and machines to use the web content", that is, the Semantic Web. This new approach has resulted in the Linked Open Data (LOD) Cloud, which includes more than 295 large datasets contributed by experts belonging to diverse communities such as geography, entertainment, and life sciences. However, the current interlinks between datasets in the LOD Cloud, as we will illustrate, are too shallow to realize many of the benefits promised. If this limitation is left unaddressed, the LOD Cloud will merely be more data that suffers from the same kinds of problems that plague the Web of Documents, and hence the vision of the Semantic Web will fall short. This thesis presents a comprehensive solution to the issue of alignment and relationship identification using a bootstrapping-based approach. By alignment we mean the process of determining correspondences between the classes and properties of ontologies. Between classes, we identify subsumption, equivalence, and part-of relationships. Between instances, the work identifies part-of relationships. Between properties, we establish subsumption and equivalence relationships. By bootstrapping we mean the process of using the information contained within the datasets to improve the data within them. The work showcases the use of bootstrapping-based methods to identify and create richer relationships between LOD datasets. The BLOOMS project and the PLATO project, both built as part of this research, have provided evidence of the feasibility and applicability of the solution.
Spatio-Temporal-Thematic Queries of Semantic Web Data: a Study of Expressivity and Efficiency and NSF Award 1143717 III: EAGER - Expressive Scalable Querying over Linked Open Data. These projects provided the framework within which the experiments described in this dissertation were conducted.
Dedicated to my family and friends.
doi:10.1145/2451836.2451839 fatcat:4nehxwrtqrhvpod3cnvvgokdf4