The Turing Way: A Handbook for Reproducible Data Science

The Turing Way Community, Rachael Ainsworth, Becky Arnold, Louise Bowler, Sarah Gibson, Patricia Herterich, Rosie Higman, Anna Krystalli, Alexander Morley, Martin O'Reilly, Kirstie Whitaker
2019 Zenodo  
Poster presentation of the Turing Way at the 2019 Open Science Fair. Abstract: The Turing Way is a handbook to support students, their supervisors, funders and journal editors in ensuring that reproducible data science is "too easy not to do". It includes training material on topics such as version control and analysis testing, and will build upon Alan Turing Institute case studies and workshops. The project also demonstrates open and transparent project management and communication with future users, as it is openly developed at our GitHub repository. All resources associated with workshops we have delivered, as well as how to organise a Book Dash (a one-day book sprint), are also openly available. Reproducible research is necessary to ensure that scientific work can be trusted. Funders and publishers are beginning to require that publications include access to the underlying data and the analysis code. The goal is to ensure that all results can be independently verified and built upon in future work, which is sometimes easier said than done. Sharing these research outputs means understanding data management, library sciences, software development, and continuous integration techniques: skills that are not widely taught or expected of academic researchers and data scientists. This poster will present an overview of the handbook so far and show Open Science Fair participants how they can contribute their knowledge to make it even better going forwards or how to open up their own projects to a wider contributor community. This poster relates to the overall theme of the conference, as the Turing Way provides the tools to improve research habits in a self-contained handbook. It will also ensure that PhD students, postdocs, PIs and funding teams know which parts of the "responsibility of reproducibility" they can affect, and what they should do to nudge research and data science to being more efficient, eff [...]
doi:10.5281/zenodo.3381446 fatcat:cler5wecxbdtxpm7l5hwgtae7i