The Conceptual Integration Modelling Framework: Semantics and Query Answering

Guseva Ekaterina, Université d'Ottawa / University of Ottawa
In the context of business intelligence (BI), the accuracy and accessibility of consolidated information play an important role. Integrating data from different sources involves transforming it according to constraints expressed in an appropriate language. The Conceptual Integration Modelling framework (CIM) acts as such a language. The CIM aims to let business users specify what information they need in a simplified and comprehensive language. Achieving this requires raising the level of abstraction to the conceptual level, so that users can pose queries expressed in a conceptual query language (CQL). The CIM comprises three facets: a conceptual schema against which users pose their queries, expressed in an Extended Entity Relationship (EER) model (a high-level conceptual model used to design databases); a relational multidimensional model that represents the data sources; and mappings between the conceptual schema and the sources. Such mappings can be specified in two ways. In the first scenario, called global-as-view (GAV), the global schema is mapped to views over the relational sources by specifying how to obtain tuples of a global relation from tuples in the sources. In the second scenario, called local-as-view (LAV), the sources may contain less detailed (more aggregated) data, so the local relations are defined as views over the global relations. In this thesis, we address the problem of expressibility and decidability of queries written in CQL. We first define the semantics of the CIM by translating the conceptual model into a set of first-order sentences containing a class of conceptual dependencies (CDs), namely tuple-generating dependencies (TGDs) and equality-generating dependencies (EGDs), in addition to certain first-order restrictions that express multidimensionality. Here, multidimensionality means that facts in a data warehouse can be described from different perspectives. The EGDs set the equality between tuples and the TGDs set the rule [...]
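For orientation, the dependency and mapping classes named in the abstract are usually written in the following general first-order shapes. This is a sketch in standard textbook notation, not the thesis's own formalization: the symbols $\varphi$, $\psi$, $q_S$, $q_G$, $G$ and $S$ are assumptions introduced here for illustration.

```latex
% Tuple-generating dependency (TGD): whenever the body \varphi holds,
% some tuple satisfying the head \psi must exist.
\forall \bar{x}\,\bar{y}\; \bigl( \varphi(\bar{x},\bar{y}) \rightarrow \exists \bar{z}\; \psi(\bar{x},\bar{z}) \bigr)

% Equality-generating dependency (EGD): whenever the body \varphi holds,
% two of its variables must denote the same value.
\forall \bar{x}\; \bigl( \varphi(\bar{x}) \rightarrow x_i = x_j \bigr)

% GAV mapping assertion: a global relation G is populated by a query q_S
% over the source relations.
\forall \bar{x}\; \bigl( q_S(\bar{x}) \rightarrow G(\bar{x}) \bigr)

% LAV mapping assertion: a source relation S is described as a view q_G
% over the global relations.
\forall \bar{x}\; \bigl( S(\bar{x}) \rightarrow \exists \bar{y}\; q_G(\bar{x},\bar{y}) \bigr)
```

In these shapes $\varphi$, $\psi$, $q_S$ and $q_G$ stand for conjunctions of relational atoms, and the mapping assertions are given in their sound (open-world) reading, in which the implication runs in one direction only.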
doi:10.20381/ruor-3966 fatcat:nzm673nplzbm3m2h4hhsmzxwmq