Multithreaded variant calling in elPrep 5 [article]

Charlotte Herzeel, Pascal Costanza, Dries Decap, Jan Fostier, Roel Wuyts, Wilfried Verachtert
2020 bioRxiv   pre-print
We present elPrep 5, which updates the elPrep framework for processing sequencing alignment/map files with variant calling. elPrep 5 can now execute the full pipeline described by the GATK Best Practices for variant calling, which consists of PCR and optical duplicate marking, sorting by coordinate order, base quality score recalibration, and variant calling using the haplotype caller algorithm. elPrep 5 produces identical BAM and VCF output as GATK 4 while significantly reducing the runtime by
more » ... parallelizing and merging the execution of the pipeline steps. Our benchmarks show that elPrep 5 speeds up the runtime of the variant calling pipeline by a factor 8-16x on both whole-exome and whole-genome data while using the same hardware resources as GATK 4. This makes elPrep 5 a suitable drop-in replacement for GATK 4 when faster execution times are needed.
doi:10.1101/2020.12.11.421073 fatcat:qobny74o5vftfd2jrb3p5ipei4