An approach to quantify integration quality using feedback on mapping results

Fernando R.S. Serrano, Alvaro A.A. Fernandes, Klitos Christodoulou
2019 International Journal of Web Information Systems  
Purpose - The pay-as-you-go approach to data integration aims to reduce the time and effort required by proposing a bootstrap phase in which algorithms, rather than experts, identify semantic correspondences and generate the mappings. The integration produced by this highly automated bootstrap phase is likely to be of low quality, so pay-as-you-go approaches postulate a subsequent continuous improvement phase in which user feedback is assimilated to improve the quality of the integration. The purpose of this paper is to quantify the quality of a speculative integration, using one particular type of feedback, mapping results, whilst taking into account the uncertainty of the user feedback provided.
Design/methodology/approach - The authors propose a systematic approach to quantify the quality of an integration as a conditional probability given the trustworthiness of the workers. Given a set of mappings and a set of workers of unknown trustworthiness, feedback instances are collected in the extents of the mappings that characterize the integration. Taking into account the available evidence obtained from worker feedback, the technique provides a quality quantification of the speculative integration.
Findings - Experimental results on both synthetic and real-world scenarios provide empirical evidence that the technique produces a cost-effective quantification of integration quality that faithfully reflects the judgement of the workers whilst taking into account the inherent uncertainty of user feedback.
Originality/value - Current pay-as-you-go techniques provide a limited view of the integration quality as the result of feedback assimilation. To the best of the authors' knowledge, this is the first proposal for quantifying integration quality in a systematic and principled manner using mapping results as a piece of evidence while at the same time considering the uncertainty inherited from user feedback.
doi:10.1108/ijwis-05-2018-0043 fatcat:a2obrsenvve65ntaojtz5p5emy
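
The idea of scoring an integration as a probability conditioned on worker trustworthiness can be illustrated with a small sketch. This is not the authors' technique, which the abstract does not spell out. It is a simplified naive-Bayes illustration under loudly stated assumptions: each worker's trustworthiness is taken as a known number (the paper treats it as unknown), workers answer independently, each feedback instance says whether one tuple in a mapping extent is correct, and quality is taken as the mean posterior probability of correctness over the annotated tuples. The names `tuple_posterior`, `integration_quality`, and the `prior` parameter are hypothetical.

```python
from collections import defaultdict


def tuple_posterior(votes, prior=0.5):
    """Posterior probability that one mapping-result tuple is correct.

    votes: list of (says_correct, trust) pairs, where trust is the assumed
    probability that the worker's judgement is right (hypothetical input).
    """
    like_correct = prior
    like_incorrect = 1.0 - prior
    for says_correct, trust in votes:
        # A vote agrees with the truth with probability `trust`.
        like_correct *= trust if says_correct else 1.0 - trust
        like_incorrect *= 1.0 - trust if says_correct else trust
    total = like_correct + like_incorrect
    return like_correct / total if total > 0 else prior


def integration_quality(feedback, trust, prior=0.5):
    """Mean posterior correctness over all tuples that received feedback.

    feedback: iterable of (tuple_id, worker_id, says_correct)
    trust: dict mapping worker_id to assumed trustworthiness in [0, 1]
    """
    by_tuple = defaultdict(list)
    for tuple_id, worker_id, says_correct in feedback:
        by_tuple[tuple_id].append((says_correct, trust[worker_id]))
    if not by_tuple:
        return prior  # no evidence: fall back to the prior
    posteriors = [tuple_posterior(v, prior) for v in by_tuple.values()]
    return sum(posteriors) / len(posteriors)


# Illustrative, made-up feedback: two workers judge two result tuples.
feedback = [("t1", "a", True), ("t1", "b", True), ("t2", "a", False)]
trust = {"a": 0.9, "b": 0.6}
quality = integration_quality(feedback, trust)
```

In this sketch a worker with trustworthiness 0.5 carries no information, so their votes leave the posterior at the prior; that is one simple way to keep unreliable feedback from distorting the score.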