Enabling Provenance on Large Scale e-Science Applications [chapter]

Miguel Branco, Luc Moreau
2006 Lecture Notes in Computer Science  
Large-scale e-Science experiments present unprecedented data handling requirements with their multi-petabyte data storages. Complex software applications, such as the ATLAS High Energy Physics experiment at CERN, run throughout Grid computing sites around the world in a distributed environment, with scientists performing concurrent analysis on data and producing new data products shared among the collaboration. In this paper, we introduce a multi-phase infrastructure to achieve data provenance ... or an e-Science experiment. We propose an infrastructure to integrate provenance onto an existing legacy application with strong emphasis on scalability and explore the relationship between provenance and metadata introducing a model where data provenance is made available as metadata through a separate reasoning phase.
doi:10.1007/11890850_7 fatcat:klpvdfqnhngotgvsuok4tieio4