Transformations for the compression of FASTQ quality scores of next-generation sequencing data

Raymond Wan, Vo Ngoc Anh, Kiyoshi Asai
2011 Computer applications in the biosciences : CABIOS  
Motivation: The growth of next-generation sequencing means that more effective and efficient archiving methods are needed to store the generated data for public dissemination and in anticipation of more mature analytical methods later. This article examines methods for compressing the quality score component of the data to partly address this problem. Results: We compare several compression policies for quality scores, in terms of both compression effectiveness and overall efficiency. The
policies employ lossy and lossless transformations with one of several coding schemes. Experiments show that both lossy and lossless transformations are useful, and that simple coding methods, which consume less computing resources, are highly competitive, especially when random access to reads is needed.
doi:10.1093/bioinformatics/btr689 pmid:22171329 fatcat:7k2vg35kijb7baeea7234j4vim