Continuous generation of versioned collections' members with RML and LDES

Dylan Van Assche, Sitt Min Oo, Julián Andrés Rojas, Pieter Colpaert
2022 Extended Semantic Web Conference  
When evolving datasets are used to generate a knowledge graph, it is usually challenging to keep the graph synchronized in a timely manner when changes occur in the source data. Current approaches fully regenerate a knowledge graph in such cases, which may be time-consuming depending on the data type, size, and update frequency. We propose a continuous knowledge graph generation approach that can be applied to different types of data sources. We describe continuously updating knowledge graph versions represented as a Linked Data Event Stream, and use an RML processor for RDF generation. In this paper, we present our approach and demonstrate it on different types of data such as bike-sharing, public transport timetables, and weather data. By describing entities with unique, immutable, and reproducible IRIs, we were able to identify changes in the original data collection, reducing the number of materialized triples and generation time. Our use cases show the importance of mechanisms to derive unique and stable IRI strategies for data source updates, to enable efficient knowledge graph generation pipelines. In the future, we will extend our approach to handle deletions in data collections, and conduct an extensive performance evaluation.
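The idea of detecting changes through unique, immutable, and reproducible IRIs can be illustrated with a minimal sketch. It is not the authors' implementation and does not use an RML processor. It assumes, purely for illustration, that a member IRI is derived from a hash of the record's content; the namespace `https://example.org/member/` and the helper names `member_iri` and `new_members` are hypothetical.

```python
import hashlib
import json

# Hypothetical namespace, used only for this illustration.
BASE_IRI = "https://example.org/member/"


def member_iri(record: dict) -> str:
    """Derive a reproducible IRI from the record's content.

    Assumption: hashing a canonical JSON serialization is one possible
    IRI strategy; the same content always yields the same IRI.
    """
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return BASE_IRI + digest


def new_members(records, seen_iris: set):
    """Yield (iri, record) only for records whose IRI was not seen before.

    Records that are unchanged since an earlier snapshot map to an
    already-seen IRI and are skipped, so they are not materialized again.
    """
    for record in records:
        iri = member_iri(record)
        if iri not in seen_iris:
            seen_iris.add(iri)
            yield iri, record
```

Feeding two consecutive snapshots of a source through `new_members` with a shared `seen_iris` set yields only the records that were added or changed in the second snapshot. Deletions are not detected by this sketch, which matches the limitation the abstract leaves for future work.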
dblp:conf/esws/AsscheORC22 fatcat:szsvu6grtjfxtjfblgtm7cpjx4