Query Rewriting Using Views in a Typed Mediator Environment [chapter]

Leonid A. Kalinichenko, Dmitry O. Martynov, Sergey A. Stupnikov
2004 Lecture Notes in Computer Science  
A query rewriting method is proposed for the heterogeneous information integration infrastructure formed by the subject mediator environment. The Local as View (LAV) approach, which treats schemas exported by sources as materialized views over virtual classes of the mediator, is considered as the basis for the subject mediation infrastructure. In spite of significant progress in query rewriting with views, it remains unclear how to rewrite queries in the typed, object-oriented mediator environment. This [...] paper embeds conjunctive views and queries into an advanced canonical object model of the mediator. The "selection-projection-join" (SPJ) conjunctive query semantics based on type specification calculus is introduced. The paper demonstrates how the existing query rewriting approaches can be extended to be applicable in such a typed environment. The paper shows that refinement of the mediator class instance types by the source class instance types is the basic relationship required for establishing query containment in the object environment.
[...]ble to various subject domains in science, cultural heritage, mass media, e-commerce, etc. The Local as View (LAV) approach [8], which treats schemas exported by sources as materialized views over virtual classes of the mediator, is considered as the basis for the subject mediation infrastructure. This approach is intended to cope with a dynamic, possibly incomplete set of sources. Sources may change their exported schemas or become unavailable from time to time. To disseminate the information sources, their providers register them (concurrently and at any time) at the respective subject mediators. A method and a tool supporting the process of registering information sources at the mediator were presented in [1]. The method is applicable to a wide class of source specification models representable in the hybrid semi-structured/object canonical mediator model. Ontological specifications are used to identify the mediator classes semantically relevant to a source class. The subset of source information relevant to the mediator classes is discovered by identifying the maximal commonality between a source class specification and a mediated-level class specification. Such commonality is established so that compositions of mediated class instance types can be refined by a source class instance type. This paper (for the same infrastructure as in [1]) presents an approach to query rewriting in a typed mediator environment.
The problem of rewriting queries using views has recently received significant attention. The data integration systems described in [2, 13] follow an approach in which the contents of the sources are described as views over the mediated schema. Algorithms for answering queries using views that were developed specifically for the context of data integration include the Bucket algorithm [13], the inverse-rules algorithm [2, 3, 15], the MiniCon algorithm [14], the resolution-based approach [7], the algorithm for rewriting unions of general conjunctive queries [17] and others. Query rewriting algorithms have evolved into conceptually simple and quite efficient constructs that produce the maximally-contained rewriting. Most of them have been developed for conjunctive views and queries in relational, essentially typeless data models (Datalog). In spite of significant progress in query rewriting with views, it remains unclear how to rewrite queries in the typed, object-oriented mediator environment. This paper is an attempt to fill this gap. The paper embeds conjunctive views and queries into an advanced canonical object model of the mediator [9, 11]. The "selection-projection-join" (SPJ) conjunctive query semantics based on type specification calculus [10] is introduced. The paper shows how the existing query rewriting approaches can be extended to be applicable in such an object framework. To be specific, the algorithm for rewriting unions of general conjunctive queries [17] has been chosen. The resulting algorithm for the typed environment proposed in the paper exploits the heterogeneous source registration facilities [1]. These facilities are based on the refining mapping of the specific source data models into the canonical model of the mediator, and they resolve ontological differences between mediated and local concepts as well as structural, behavioral and value conflicts between local and mediated types and classes.
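The notion of query containment that underlies a maximally-contained rewriting can be illustrated for the typeless relational case mentioned above. The following is a minimal sketch of the classical containment-mapping test for conjunctive queries without comparison predicates; it is not the typed algorithm proposed in the paper. The tuple encoding of atoms, the convention that variables start with an uppercase letter, and the names `is_var`, `extend` and `contained_in` are all assumptions made for this illustration.

```python
def is_var(term):
    # Assumed convention: variables are strings starting with an uppercase letter.
    return isinstance(term, str) and term[:1].isupper()


def extend(mapping, src_atom, dst_atom):
    """Extend `mapping` so that src_atom is mapped onto dst_atom, or return None."""
    (src_pred, src_args), (dst_pred, dst_args) = src_atom, dst_atom
    if src_pred != dst_pred or len(src_args) != len(dst_args):
        return None
    new = dict(mapping)
    for s, d in zip(src_args, dst_args):
        if is_var(s):
            # A variable must be mapped consistently everywhere it occurs.
            if new.setdefault(s, d) != d:
                return None
        elif s != d:
            # A constant can only be mapped to the identical constant.
            return None
    return new


def contained_in(q1, q2):
    """Return True if q1 is contained in q2.

    Each query is (head_atom, [body_atoms]); an atom is (predicate, args_tuple).
    q1 is contained in q2 iff a containment mapping from q2 to q1 exists.
    """
    head1, body1 = q1
    head2, body2 = q2
    start = extend({}, head2, head1)
    if start is None:
        return False

    def search(i, mapping):
        # Backtracking search: map the i-th body atom of q2 onto some atom of q1.
        if i == len(body2):
            return True
        for atom in body1:
            nxt = extend(mapping, body2[i], atom)
            if nxt is not None and search(i + 1, nxt):
                return True
        return False

    return search(0, start)


# Hypothetical example queries:
#   q1(X) :- r(X, Y), s(Y, "a")
#   q2(X) :- r(X, Z)
q1 = (("q", ("X",)), [("r", ("X", "Y")), ("s", ("Y", "a"))])
q2 = (("q", ("X",)), [("r", ("X", "Z"))])
```

Here `contained_in(q1, q2)` holds because `q1` only adds a restriction to `q2`, while the converse does not. In the typed setting of the paper, the abstract names refinement of mediator class instance types by source class instance types as the relationship needed to establish containment; this sketch does not model that.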
Due to the space limit, this paper does not consider several aspects of query rewriting. For example, the complexity of rewriting and the possibility of computing all certain answers to a union query are not discussed. These issues build on well-known results in the area (e.g., it is known that the inverse-rules algorithm produces
doi:10.1007/978-3-540-30204-9_3 fatcat:czy4bt3nhnhpbeda3ceop7egsi