TSSi—an R package for transcription start site identification from 5′ mRNA tag data

C. Kreutz, J. S. Gehring, D. Lang, R. Reski, J. Timmer, S. A. Rensing
2012 Computer applications in the biosciences : CABIOS  
High-throughput sequencing has become an essential experimental approach for the investigation of transcriptional mechanisms. For some applications like ChIP-seq, several approaches for the prediction of peak locations exist. However, these methods are not designed for the identification of transcription start sites (TSSs) because such datasets contain qualitatively different noise. In this application note, the R package TSSi is presented, which provides a heuristic framework for the identification of TSSs based on 5′ mRNA tag data. Probabilistic assumptions are made for the distribution of the data, i.e. for the observed positions of the mapped reads, as well as for systematic errors, i.e. for reads which map closely but not exactly to a real TSS; both sets of assumptions can be adapted by the user. The framework also comprises a regularization procedure which can be applied as a preprocessing step to decrease the noise and thereby reduce the number of false predictions. Availability: The R package TSSi is available from the Bioconductor web site.
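To make the two ideas in the abstract concrete (a noise-reducing preprocessing step, and an error model for reads that map close to but not exactly at a real TSS), the following toy sketch may help. It is not the TSSi algorithm and does not use the package's API: the function names `soft_threshold` and `call_start_sites`, the exponential decay model, and all parameter values are assumptions chosen purely for illustration, and Python is used only to keep the example self-contained.

```python
import math


def soft_threshold(counts, lam=1.0):
    """Shrink every tag count toward zero by lam.

    Hypothetical stand-in for a regularization step; not TSSi's procedure.
    """
    return {pos: max(c - lam, 0.0) for pos, c in counts.items()}


def call_start_sites(counts, decay=0.5, min_support=3.0, max_sites=10):
    """Greedy toy caller for start sites.

    counts maps a genomic position to its 5' tag count. The position with
    the largest remaining count is called as a site; the reads expected
    from imprecise mapping around it, assumed here to fall off as
    height * exp(-decay * distance), are then subtracted before the next
    site is searched for.
    """
    residual = dict(counts)
    sites = []
    while residual and len(sites) < max_sites:
        pos = max(residual, key=residual.get)
        height = residual[pos]
        if height < min_support:
            break  # nothing left with enough support
        sites.append((pos, height))
        for p in residual:
            expected = height * math.exp(-decay * abs(p - pos))
            residual[p] = max(residual[p] - expected, 0.0)
    return sites


# Made-up counts: two clusters of tags plus one isolated read.
counts = {100: 1, 101: 4, 102: 20, 103: 5, 104: 1, 150: 1, 200: 9, 201: 2}
sites = call_start_sites(soft_threshold(counts, lam=1.0))
print([pos for pos, _ in sites])  # [102, 200]
```

In this sketch the tags at positions 101 and 103 are explained away as imprecise mappings of the site at 102 rather than reported as separate start sites, and the isolated read at 150 is removed by the shrinkage step. Changing `decay` or `min_support` plays the role of adapting the assumptions, which the abstract says the real package leaves to the user.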
doi:10.1093/bioinformatics/bts189 pmid:22513994 fatcat:xrtefpedjzeqreqfojf3wzr4pa