Automatic generation of mediated schemas through reasoning over data dependencies

Xiang Li, Christoph Quix, David Kensche, Sandra Geisler, Lisong Guo
2011 2011 IEEE 27th International Conference on Data Engineering  
Mediated schemas lie at the center of the well-recognized data integration architecture. Classical data integration systems rely on a mediated schema created by human experts through an intensive design process. Automatic generation of mediated schemas is still a goal to be achieved. We generate mediated schemas by merging multiple source schemas interrelated by tuple-generating dependencies (tgds). Schema merging is the process of consolidating multiple schemas into a unified view. The task becomes particularly challenging when the schemas are highly heterogeneous and autonomous. Existing approaches fall short in various aspects: the expressiveness of input mappings is restricted, a data-level interpretation is lacking, the output mapping is not in a logical language (or not given at all), and merging is confined to the binary case. We present here a novel system which is able to perform native n-ary schema merging using P2P-style tgds as input. Suited to the scenario of generating mediated schemas for data integration, the system opts for a minimal schema signature retaining all certain answers of conjunctive queries. Logical output mappings are generated to support the mediated schemas, which enable query answering and, in some cases, query rewriting.
doi:10.1109/icde.2011.5767913 dblp:conf/icde/LiQKGG11 fatcat:kbk7bgb56rcx5k2o6unheo7ice