The file type is application/pdf.
VCFdbR: A method for expressing biobank-scale Variant Call Format data in a SQLite database using R
[article]
2020
bioRxiv
pre-print
As exome and whole-genome sequencing cohorts grow in size, the data they produce strain the limits of current tools and data structures. The Variant Call Format (VCF) was originally created as part of the 1000 Genomes Project. Flexible and concise enough to describe the genetic variations of thousands of samples in a single flat file, the VCF has become the standard for communicating the results of large-scale sequencing experiments. Because of their static and text-based structure, VCFs remain
doi:10.1101/2020.04.28.066894
fatcat:jkgbxlu7f5djraqsfohhp72iue