A survey of Web technology for metadata aggregation in cultural heritage

Nuno Freire, Antoine Isaac, Glen Robson, John Brooks Howard, Hugo Manguinhas, Leslie Chan, Fernando Loizides
2018 Information Services and Use  
On the World Wide Web, a very large number of resources are made available through digital libraries. The existence of many individual digital libraries, maintained by different organizations, brings challenges to the discoverability and usage of these resources by potential users. A widely used approach is metadata aggregation, where a central organization takes the role of facilitating the discoverability and use of the resources by collecting their associated metadata. The central organization can further promote the usage of the resources by means that cannot be efficiently undertaken by each digital library in isolation. This paper focuses on the domain of cultural heritage, where OAI-PMH has been the solution embraced, since discovery of resources was only feasible if based on metadata instead of full text. However, the technological landscape has changed. Nowadays, with the technological improvements accomplished by network communications, computational capacity, and Internet search engines, the motivation for adopting OAI-PMH is not as clear as it used to be. In this paper, we present the results of our analysis of available potential technologies, using as application context the Europeana Network and its requirements for metadata aggregation. We cover the following technologies:
This article is published online with Open Access and distributed under the terms of the Creative Commons Attribution Non-Commercial License (CC BY-NC 4.0). 0167-5265/17/$35.00 © 2017 - IOS Press and the authors.
... isolation. This scenario is widely applied in the domain of cultural heritage, where the number of organizations with their own digital libraries is very large. In Europe, Europeana has the role of facilitating the usage of cultural heritage resources from and about Europe. Although many European cultural heritage organizations do not yet have a presence in Europeana, it already holds metadata of resources originating from more than 3,500 providers (source: http://statistics.europeana.eu/europeana [consulted on 4 January 2017]). This domain is also characterized by users who often have very specific information needs, which cannot be easily fulfilled by Internet search engines.
Retrieving resources based on metadata, in combination with the hypertext documents of the World Wide Web, is a challenge for which search engines have not yet provided an effective solution; as a result, the retrieval of cultural heritage resources via search engines is ineffective.
The technological approach to metadata aggregation has been mostly based on the OAI-PMH protocol, a technology initially designed in 1999. OAI-PMH was meant to address shortcomings in scholarly communication by providing a technical interoperability solution for the discovery of e-prints via metadata aggregation. The cultural heritage domain embraced the solution offered by OAI-PMH; however, the technological landscape around our domain has changed. Nowadays, cultural heritage organizations are increasingly applying technologies designed for wider interoperability on the World Wide Web. Particularly relevant for our work are those related to the social web, the web of data, Internet search engine optimization, and IIIF (the International Image Interoperability Framework).
In this paper, we present the results of our work in surveying available web technology for its applicability to metadata aggregation in cultural heritage. This work is part of our aim to rethink the technological approach to metadata aggregation, with the goal of finding a solution that makes the continuous operation of aggregation networks more efficient and lowers the technical barriers for data providers to share their resources. Our work is guided by the study of the existing aggregation network of Europeana, from which we identify the requirements for metadata aggregation.
Europeana provides access to digitised cultural resources from a wide range of cultural heritage institutions across Europe, mostly libraries, museums, archives and galleries. It seeks to enable users to search and access knowledge in all the languages of Europe.
This is done either directly, via its web portals, or indirectly, via third-party applications built on top of its data services (search APIs and Linked Open Data). The Europeana service is based on the aggregation and exploitation of (meta)data about digitised objects from very different contexts. To provide seamless, efficient services on top of such aggregation, it must solve hard data integration issues. To address these, Europeana has developed infrastructures and workflows for aggregating, ingesting, indexing, normalising, and publishing data. This paper makes the following scientific contributions to the digital libraries community:
doi:10.3233/isu-170859 fatcat:dehdk7ad2rhqhhndmlbgnrtxtu