Assessing Quantity and Quality of Links Between Linked Data Datasets

Ciro Baron Neto, Dimitris Kontokostas, Sebastian Hellmann, Kay Müller, Martin Brümmer
2016 The Web Conference  
The Linked Data Web is growing, and it is becoming increasingly necessary to analyze the relationships between datasets to exploit its full value. LOD datasets range from datasets with low cohesion, containing data from different Fully Qualified Domain Names (FQDNs) and namespaces, to highly cohesive datasets. This paper evaluates the quantity and quality of links between distributions, datasets, and ontologies, categorizing and defining different types of links. We streamed and indexed 2.5 billion triples and extracted 0.5 billion links using probabilistic data structures. Our results present an analysis of datasets with respect to valid links, dead links, and the number of namespaces described by distributions and datasets. They indicate that 7.9% of the links we indexed and verified are dead.
dblp:conf/www/NetoKHMB16 fatcat:rrsbc6tmvbhzbeqckjie6bnxxi