Testing Our Assumptions: Preliminary Results from the Data Curation Network

Elizabeth Coburn, Lisa Johnston
2020 Journal of eScience Librarianship  
Objective: Data curation is becoming widely accepted as a necessary component of data sharing. Yet, as there are so many different types of data with various curation needs, the Data Curation Network (DCN) project anticipated that a collaborative approach to data curation across a network of repositories would expand what any single institution might offer alone. Now, halfway through a three-year implementation phase, we're testing our assumptions using one year of data from the DCN. Methods:
Ten institutions participated in the implementation phase of a shared staffing model for curating research data. Starting on January 1, 2019, for 12 months we tracked the number, file types, and disciplines represented in data sets submitted to the DCN. Participating curators were matched to data sets based on their self-reported curation expertise. Aspects such as curation time, level of satisfaction with the assignment, and lack of appropriate expertise in the network were tracked and analyzed. Results: Seventy-four data sets were submitted to the DCN in year one. Seventy-one of them were successfully curated by DCN curators. Each curation assignment takes 2.4 hours on average, and data sets take a median of three days to pass through the network. By analyzing the domain and file types of first-year submissions, we find that our coverage is well represented across domains and that our capacity is higher than the demand, but we also observed that the higher volume of data containing software code relied on certain curator expertise more often than others, creating potential unbalance. Conclusions: The data from year one of the DCN pilot have verified key assumptions about our collaborative approach to data curation, and these results have raised additional questions about capacity, equitable use of network resources, and sustained growth that we hope to answer by the end of this implementation phase.
doi:10.7191/jeslib.2020.1186 fatcat:tsrfc5ozkjcbxca5xinwaqoat4