Reproducibility Analysis of Scientific Workflows

2017 Acta Polytechnica Hungarica  
Scientific workflows are efficient tools for specifying and automating compute- and data-intensive in-silico experiments. An important challenge in using them is reproducibility. To make a workflow reproducible, many factors that can influence or even prevent reproduction have to be investigated: missing descriptions and samples; missing provenance data about environmental parameters and data dependencies; and dependencies of executions which are based on ... hardware, changing or volatile third-party services, or randomly generated values. Some of these factors (called dependencies) can be eliminated by careful design or by huge resource usage, but most of them cannot be bypassed. Our investigation deals with the critical dependencies of execution. In this paper we set up a mathematical model to evaluate the results of the workflow; in addition, we provide a mechanism to make the workflow reproducible based on provenance data and statistical tools.
doi:10.12700/aph.14.2.2017.2.11 fatcat:r5ffdhjcnrbk3m7dwyccxdyh2i