Design and implementation of a dataspace model for e-Science applications

Adnan Muslimovic
2008 unpublished
Modern scientific collaborations are very often based on large-scale linking of databases that were not expected to be used together when they were originally developed. Within the distributed database community, database integration approaches traditionally focus on structural heterogeneity. However, in many scientific applications there is additionally a strong demand to solve problems of semantic heterogeneity. The heterogeneous and distributed mix of data sources nowadays requires intelligent management systems in order to provide a unified view over such data. The research challenge motivating the work in this thesis arises from the vision of dataspaces, whose main idea is to abstract from the underlying data source structures by providing a system that manages various heterogeneous data as a single information source. The dataspace concepts are presented as a vision; however, their implementation in e-Science application environments opens new research challenges, especially in distributed dynamic environments such as scientific grids. The main effort of this work is to provide an integrated view over data collected in scientific collaborations through e-Science life cycles. These life cycles represent a process of collecting data for significant analysis by introducing a hierarchical and iterative model that includes several different activities. Each activity contains a number of tasks gathering information from multiple heterogeneous data resources that are organized as participants in scientific dataspaces. The e-Science Life Cycle Dataspace model is presented as an ontology specified in the Web Ontology Language, which allows building semantically rich relationships among e-Science life cycle iterations and their participating data elements.
doi:10.25365/thesis.2658 fatcat:qgq4ljarcvefznckozxzjsmrnm