"Good things come in small packets": How (inter)national Digital Research Infrastructure can support "Small Data" Humanities and Cultural Heritage research

Daniel O'Donnell
2020 Zenodo  
The purpose of this whitepaper is to describe a largely unrecognised and unsupported but very common research data management (RDM) use-case: that of the traditional "Small Data" Humanities and Cultural Heritage (HCH) research project producing or working with "primary source" research data (i.e. facsimiles and representations of cultural objects as digital text, media, or models). This paper complements the submission from the Canadian Society for Digital Humanities/Société canadienne des humanités numériques (CSDH/SCHN), which is concerned with the case of research and data in the Digital Humanities, including in such small data contexts, more broadly. As we shall argue in this paper, the kind of traditional data and RDM use-case we are discussing here has gone largely unrecognised by Digital Research Infrastructure (DRI) developers and policy makers, in part because the nature, size, methods of production, and purpose of these datasets are quite different from data production and management in other disciplines, and in part because the data themselves are not always understood as data (or their management as an RDM problem) by the relevant research community (e.g. 1,2). The result is that large quantities of small-project HCH research data are poorly managed and maintained and that often extremely well-curated datasets produced by HCH researchers remain invisible, siloed, or difficult to access by Big Data researchers (where such access is appropriate and ethical); unnecessarily expensive to produce and maintain; and, as a result, in danger of premature loss or obsolescence to researchers and the wider community. The specific tools and techniques required to address these problems already mostly exist within the international DRI ecosystem, though we are aware of no single system or provider that includes them all. We are also aware of no system or provider that specifically supports the workflow, use-case and datatypes we describe here, including such Humanities-focussed projects as Humanities Commo [...]
doi:10.5281/zenodo.4321072 fatcat:leyod3oqfnanxhsoumb2mt5idi