PerCon: Support for Heterogeneous Data Management and Analysis via Mixed-initiative interaction
Bulletin of IEEE Technical Committee on Digital Libraries
This research proposes a digital library system that allows users to manage a large number of heterogeneous datasets and supports mixed-initiative interaction for data analysis across related datasets. e-Science emerged in areas such as physics, earth science, and bioinformatics, where voluminous datasets are common and the need for infrastructure to manage and share datasets for analysis is especially apparent. With the data explosion occurring in many areas, one crucial issue is the
... ity of data sources due to different data platforms and/or environments. Increasing interdisciplinary research and advances in devices, tools, and software for scientific data management generate more and more heterogeneous data. As the scientific community and industry face an increasing amount of diverse and interrelated data, analyzing that data to discover meaningful information and knowledge is becoming more challenging. Digital libraries are one example of scientific data management and analysis across various domains. However, digital libraries to date have been specific to particular research domains. Beyond a domain-specific data environment, the proposed digital library provides a general data infrastructure/platform applicable to various research fields and supports data analysis based on human-computer interaction. To achieve this, three aspects of the digital library are investigated: software requirements and capabilities for heterogeneous data management, a visual workspace environment for translating data into information and knowledge, and a mixed-initiative framework for data analysis. A digital library system called PerCon is being developed as a concrete instance of such a digital library rather than as a conceptual framework. PerCon is more than a typical digital library, as it integrates data management with data manipulation, presentation, and analysis capabilities. In the long term, the proposed digital library aims to explore the potential for data reuse in the more general field of heterogeneous data management and analysis.