Reconciling Equational Heterogeneity Within a Data Federation

Aykut Firat, Stuart E. Madnick, Michael Siegel, Benjamin Grosof, Frank Manola
2009 Social Science Research Network  
Mappings in most federated databases are conceptualized and implemented as black-box transformations between source schemas and a federated schema. This approach does not allow specific mappings to be declared once and reused in other situations. We present an alternative approach, in which data-level mappings are represented independently of source and federated schemas as a network between "contexts". This compendious representation expedites the data federation process via mapping reuse and automated mapping composition from simpler mappings. We illustrate the benefits of mapping reuse and composition by using an example that incorporates equational mappings and the application of symbolic equation solving techniques.

Motivational Example

Consider the problem of finding cheap airfares on the Web. The actual example in our prototype system uses eight online airfare sites. For didactic reasons, however, we consider the simplified and slightly dramatized scenario shown in Figure 1, with three sources (an airfare source, cheaptickets, and two ancillary sources, currencyrates and cityairport) and a single receiver (user) with conflicting assumptions. We also assume that there is a one-to-one mapping between the federated schema and the local schemas, to highlight the data-level conflicts. Surprisingly, even in such a simple scenario the semantic differences provide enough complexity to illustrate some of the important issues.

Under this scenario, we assume that web sites are wrapped as relational databases [13], and the users are presented with a relational database interface. Such a scenario is quite realistic using, for instance, IBM's DB2 Information Integrator. One user of the system, whom we will call Ben, is an international student looking for a round-trip ticket from Boston to Istanbul, departing on June 1st and returning on July 1st, 2007. Ben wants to obtain the airfare and airline information for his trip and formulates the following SQL query:
doi:10.2139/ssrn.1477615 fatcat:dituenolajhodebpe5z4wa24ga