Using Provenance for Quality Assessment and Repair in Linked Open Data

Giorgos Flouris, Yannis Roussakis, María Poveda-Villalón, Pablo N. Mendes, Irini Fundulaki
2012 International Semantic Web Conference  
As the number of data sources publishing their data on the Web of Data is growing, we are experiencing an immense growth of the Linked Open Data cloud. The lack of control on the published sources, which could be untrustworthy or unreliable, along with their dynamic nature that often invalidates links and causes conflicts or other discrepancies, could lead to poor quality data. In order to judge data quality, a number of quality indicators have been proposed, coupled with quality metrics that quantify the "quality level" of a dataset. In addition to the above, some approaches address how to improve the quality of the datasets through a repair process that focuses on how to correct invalidities caused by constraint violations by either removing or adding triples. In this paper we argue that provenance is a critical factor that should be taken into account during repairs to ensure that the most reliable data is kept. Based on this idea, we propose quality metrics that take into account provenance and evaluate their applicability as repair guidelines in a particular data fusion setting.
dblp:conf/semweb/FlourisRPMF12 fatcat:hhoz3e55bffvle3to2mtsq2uge