Biological data integration: wrapping data and tools
IEEE Transactions on Information Technology in Biomedicine
Scientific data is now inevitably digital and is stored in a wide variety of formats in heterogeneous systems. Scientists need an integrated view of remote or local heterogeneous data sources, together with advanced data access, analysis, and visualization tools. Building a digital library for scientific data requires accessing and manipulating data extracted from flat files or databases, documents retrieved from the Web, and data generated by software. We present an approach to
... ping web data sources, databases, flat files, or data generated by tools through a database view mechanism. Generally, a wrapper has two tasks: first, it sends a query to the source to retrieve data; second, it builds the expected output with respect to the virtual structure. Our wrappers perform these two tasks with, respectively, a retrieval component and an eXtensible Markup Language (XML) engine. The retrieval component is based on an intermediate object view mechanism, called search views, that maps the source capabilities to attributes. The originality of the approach consists of: 1) a generic view mechanism to seamlessly access data sources with limited capabilities and 2) the ability to wrap data sources as well as the useful specific tools they may provide. Our approach has been developed and demonstrated as part of the multidatabase system supporting queries via uniform object protocol model (OPM) interfaces.
Index Terms: Biological data integration, database view, eXtensible Markup Language (XML), mediation, web data sources.
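The two wrapper tasks described in the abstract (retrieval through a search view, then construction of structured output) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `SearchView` class, the `to_xml` and `wrap` functions, the in-memory flat file, and the tag names are all hypothetical and chosen only for illustration. The sketch assumes a source that can be searched on some attributes but not others, which is one plausible reading of "limited capabilities".

```python
import xml.etree.ElementTree as ET

# Hypothetical flat-file source: one tab-separated record per line.
FLAT_FILE = "P1\tkinase\thuman\nP2\tkinase\tmouse\nP3\tligase\thuman\n"
COLUMNS = ("accession", "family", "organism")


class SearchView:
    """Hypothetical search view: records which attributes the source can search on."""

    def __init__(self, searchable):
        # Attributes the source can filter on; the others are retrieve-only.
        self.searchable = set(searchable)

    def retrieve(self, **conditions):
        """Task 1: query the source and return the matching records as dicts."""
        unsupported = set(conditions) - self.searchable
        if unsupported:
            raise ValueError(f"source cannot search on: {sorted(unsupported)}")
        matches = []
        for line in FLAT_FILE.splitlines():
            record = dict(zip(COLUMNS, line.split("\t")))
            if all(record[name] == value for name, value in conditions.items()):
                matches.append(record)
        return matches


def to_xml(records, root_tag="proteins", item_tag="protein"):
    """Task 2: build the output as XML, one child element per attribute."""
    root = ET.Element(root_tag)
    for record in records:
        item = ET.SubElement(root, item_tag)
        for name, value in record.items():
            ET.SubElement(item, name).text = value
    return ET.tostring(root, encoding="unicode")


def wrap(view, **conditions):
    """A wrapper call: retrieval followed by XML construction."""
    return to_xml(view.retrieve(**conditions))


view = SearchView(searchable=["family", "organism"])
result = wrap(view, family="kinase", organism="human")
print(result)
```

In this sketch, a query on an attribute the source cannot search (here `accession`) is rejected by the view before the source is touched, so the caller sees a uniform interface whatever the underlying source supports.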