Enabling rapid cloud-based analysis of thousands of human genomes via Butler [article]

Sergei Yakneen, Sebastian Waszak, Michael Gertz, Jan O. Korbel
2017 bioRxiv pre-print
We present Butler, a computational framework developed in the context of the international Pan-cancer Analysis of Whole Genomes (PCAWG) project to overcome the challenges of orchestrating analyses of thousands of human genomes on the cloud. Butler operates equally well on public and academic clouds. This highly flexible framework facilitates management of virtual cloud infrastructure, software configuration, and genomics workflow development, and provides unique capabilities in workflow execution management. By comprehensively collecting and analysing metrics and logs, and by performing anomaly detection, notification, and cluster self-healing, Butler enables large-scale analytical processing of human genomes with 43% increased throughput compared to prior setups. Butler was key for delivering the germline genetic variant call-sets in 2,834 cancer genomes analysed by PCAWG.
doi:10.1101/185736 fatcat:bdki7pxpv5d6hifil3mh7yrbpm