If these crawls could talk: Studying and documenting web archives provenance

Emily Maemura, Nicholas Worby, Ian Milligan, Christoph Becker
2018 Journal of the Association for Information Science and Technology  
The increasing use and prominence of web archives raises the urgency of establishing mechanisms for transparency in the making of web archives to facilitate the process of evaluating a web archive's provenance, scoping, and absences. Some choices and process events are captured automatically, but their interactions are not currently well understood or documented. This study examines the decision space of web archives and its role in shaping what is and what is not captured in the web archiving process. By comparing how three different web archives collections were created and documented, we investigate how curatorial decisions interact with technical and external factors and we compare commonalities and differences. The findings reveal the need to understand both the social and technical context that shapes those decisions and the ways in which these individual decisions interact. Based on the study, we propose a framework for documenting key dimensions of a collection that addresses the situated nature of the organizational context, technical specificities, and unique characteristics of web materials that are the focus of a collection. The framework enables future researchers to undertake empirical work studying the process of creating web archives collections in different contexts.
doi:10.1002/asi.24048 fatcat:d3fdgdqapnej7isd4zpa4ybdx4