ProvenanceWeek 2020: Machine Learning Pipelines: Provenance, Reproducibility and FAIR Data Principles [article]

Sheeba Samuel, Frank Loeffler, Birgitta König-Ries
2020 Figshare  
This presentation was given at ProvenanceWeek 2020. In it, we describe our goals and initial steps in supporting the end-to-end reproducibility of ML pipelines. We investigate which factors beyond the availability of source code and datasets influence the reproducibility of ML experiments. We propose ways to apply FAIR data practices to ML workflows. We present our preliminary results on the role of our tool, ProvBook, in capturing and comparing the provenance of ML experiments and their reproducibility using Jupyter Notebooks.
Paper: https://fusion.cs.uni-jena.de/fusion/wp-content/uploads/2020/06/FAIRMLpipelines.pdf
Video of the talk: https://youtu.be/QVPqg5MGAew
doi:10.6084/m9.figshare.12529634.v1 fatcat:6ls7f7hlqbcuzo27zkzfsdsuhi