Feeding PIDza to VIVO: data ingest with SPARQL-Generate

Maxime Lefrançois, Sandra Mierz
2021-06-23, Zenodo
The first hurdle after installing VIVO is filling it with an initial set of data about an institution, its researchers, and their publications. Done manually, this is a cumbersome and time-consuming process. One way to overcome it is to use open data that carries a persistent identifier (PID) such as a ROR ID, ORCID iD, or DOI. The advantage lies in reduced processing of the input data: because the data does not need to be disambiguated, ingestion reduces to mapping the data to the VIVO ontology. While several tools exist that can import a single PID-identified object into VIVO, the release of DataCite Commons takes this approach to the next level. DataCite Commons offers an interface to a so-called PID-Graph: a structure of multiple connected data objects, each identified by a PID. It enables queries that exploit the connections between several PIDs, for example querying an organization (identified by a ROR ID), its affiliated persons (identified by their ORCID iDs), and subsequently their publications (identified by DOIs), thus providing a quick data basis for an empty research information system. In this talk, we will present a microservice that imports data from the DataCite Commons PID-Graph and the ROR API into VIVO (https://github.com/vivo-community/generate2vivo). This microservice is based on lifting rules defined in the SPARQL-Generate RDF transformation language, which we will give an overview of beforehand. SPARQL-Generate is an expressive template-based language for generating RDF streams or text streams from RDF datasets and document streams in arbitrary formats (for more information, see https://w3id.org/sparql-generate/).
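To make the "no disambiguation, only mapping" idea from the abstract concrete, here is a minimal Python sketch. It is not the generate2vivo implementation, which uses SPARQL-Generate lifting rules. It assumes the PID-Graph response has already been fetched and flattened into a nested dictionary; that input shape, the sample values, the `lift_to_ntriples` function, and the `http://example.org/individual/` namespace for minted nodes are all hypothetical. The class and property IRIs (`foaf:Person`, `foaf:Organization`, `bibo:Document`, `vivo:Position`, `vivo:Authorship`, `vivo:relates`) are standard terms, but the modeling is deliberately simplified.

```python
"""Sketch: lift a PID-identified record into simplified VIVO-style N-Triples."""

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
FOAF = "http://xmlns.com/foaf/0.1/"
BIBO = "http://purl.org/ontology/bibo/"
VIVO = "http://vivoweb.org/ontology/core#"
BASE = "http://example.org/individual/"  # hypothetical namespace for minted nodes

# Hypothetical, already-flattened input; all names and identifiers are placeholders.
SAMPLE = {
    "ror": "https://ror.org/000000000",
    "name": "Example University",
    "people": [
        {
            "orcid": "https://orcid.org/0000-0002-1825-0097",
            "name": "Josiah Carberry",
            "publications": [
                {"doi": "https://doi.org/10.5555/12345678", "title": "An example dataset"}
            ],
        }
    ],
}


def iri(value: str) -> str:
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text: str) -> str:
    """Serialize a plain string literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def triple(subject: str, predicate: str, obj: str) -> str:
    """Build one N-Triples line; `obj` must already be serialized."""
    return f"{iri(subject)} {iri(predicate)} {obj} ."


def lift_to_ntriples(org: dict) -> list[str]:
    """Map organization -> persons -> publications, using the PIDs as IRIs."""
    out = [
        triple(org["ror"], RDF_TYPE, iri(FOAF + "Organization")),
        triple(org["ror"], RDFS_LABEL, literal(org["name"])),
    ]
    for person in org["people"]:
        local = person["orcid"].rsplit("/", 1)[-1]
        position = f"{BASE}position-{local}"
        out += [
            triple(person["orcid"], RDF_TYPE, iri(FOAF + "Person")),
            triple(person["orcid"], RDFS_LABEL, literal(person["name"])),
            # The affiliation is modeled as a Position relating person and organization.
            triple(position, RDF_TYPE, iri(VIVO + "Position")),
            triple(position, VIVO + "relates", iri(person["orcid"])),
            triple(position, VIVO + "relates", iri(org["ror"])),
        ]
        for n, pub in enumerate(person["publications"]):
            authorship = f"{BASE}authorship-{local}-{n}"
            out += [
                triple(pub["doi"], RDF_TYPE, iri(BIBO + "Document")),
                triple(pub["doi"], RDFS_LABEL, literal(pub["title"])),
                # The authorship node relates the person to the document.
                triple(authorship, RDF_TYPE, iri(VIVO + "Authorship")),
                triple(authorship, VIVO + "relates", iri(person["orcid"])),
                triple(authorship, VIVO + "relates", iri(pub["doi"])),
            ]
    # Drop duplicates (e.g. a document shared by co-authors), keeping order.
    return list(dict.fromkeys(out))


if __name__ == "__main__":
    print("\n".join(lift_to_ntriples(SAMPLE)))
```

Because every entity arrives with its own PID, the sketch can use that PID directly as the subject IRI and never has to decide whether two records describe the same person or paper. A real VIVO ingest would likely mint local individual URIs and link the PIDs to them instead.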
DOI: https://doi.org/10.5281/zenodo.5027304 · fatcat: https://fatcat.wiki/release/57wwq66vyralxe6q52cflc2iie