Single-base mismatch profiles for NGS samples [article]

Marco Chierici and Giuseppe Jurman and Marco Roncador and Cesare Furlanello
2011 arXiv pre-print
Within the preprocessing pipeline of a Next Generation Sequencing sample, the set of Single-Base Mismatches is one of the first outcomes, together with the number of correctly aligned reads. The union of these two sets provides a 4x4 matrix (called Single Base Indicator, SBI in what follows) that represents a blueprint of the sample and of its preprocessing ingredients, such as the sequencer, the alignment software, and the pipeline parameters. In this note we show that, under the same technological ... ions, there is a strong relation between the SBI and the biological nature of the sample. To reach this goal we need to introduce a similarity measure between SBIs; we also show how two measures commonly used in machine learning can be of help in this context.
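As a rough illustration of the idea, the sketch below builds a 4x4 count matrix from aligned (reference base, read base) pairs and compares two such matrices. Everything in it is an assumption made for illustration: the function names are hypothetical, the diagonal is taken to hold per-nucleotide match counts, and the Frobenius distance is only a stand-in, since the abstract does not name the two similarity measures the authors use.

```python
from math import sqrt

BASES = "ACGT"  # assumed row/column order of the 4x4 matrix


def sbi_matrix(pairs):
    """Count (reference base, read base) pairs into a 4x4 matrix.

    Off-diagonal cells hold single-base mismatches; the diagonal holds
    matching bases (an assumption about how correct alignments are counted).
    """
    index = {base: i for i, base in enumerate(BASES)}
    matrix = [[0] * 4 for _ in range(4)]
    for ref, read in pairs:
        if ref in index and read in index:  # skip N and other symbols
            matrix[index[ref]][index[read]] += 1
    return matrix


def normalize(matrix):
    """Scale counts to frequencies so samples of different depth compare."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("empty matrix")
    return [[value / total for value in row] for row in matrix]


def frobenius_distance(a, b):
    """Illustrative distance between two matrices (not the paper's measure)."""
    return sqrt(sum((x - y) ** 2
                    for row_a, row_b in zip(a, b)
                    for x, y in zip(row_a, row_b)))


# Toy data: each tuple is (reference base, read base) at an aligned position.
sample_a = [("A", "A"), ("A", "G"), ("C", "C"), ("T", "T")]
sample_b = [("A", "A"), ("C", "C"), ("C", "T"), ("T", "T")]

distance = frobenius_distance(normalize(sbi_matrix(sample_a)),
                              normalize(sbi_matrix(sample_b)))
```

Normalizing before comparing is a design choice of this sketch: it keeps the distance from being dominated by differences in sequencing depth between the two samples.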
arXiv:1109.1108v1 fatcat:hkqs35artrg7rdnd2vpubd3iee