Efficient Computation of Sequence Mappability
2022
Algorithmica
In the (k, m)-mappability problem, for a given sequence T of length n, the goal is to compute a table whose ith entry is the number of indices j ≠ i such that the length-m substrings of T starting ...
We present several efficient algorithms for the general case of the problem. ...
help of a reference sequence. ...
doi:10.1007/s00453-022-00934-y
fatcat:x6frpxpvlnbwtczhvcid2nj6zi
Efficient Computation of Sequence Mappability
[article]
2021
arXiv
pre-print
In the (k,m)-mappability problem, for a given sequence T of length n, the goal is to compute a table whose ith entry is the number of indices j ≠ i such that the length-m substrings of T starting at positions ...
Previous works on this problem focused on heuristics computing a rough approximation of the result or on the case of k=1. We present several efficient algorithms for the general case of the problem. ...
A first step of these techniques is to compute the distances between all pairs of sequences representing the set of species or taxa under study. ...
arXiv:1807.11702v3
fatcat:czz74g3soje6dgkqogb5jrsaaq
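Note: the definition quoted in the snippets above is truncated. As a rough illustration only, the following sketch assumes the standard reading of (k, m)-mappability, in which two length-m substrings are counted as matching when they differ in at most k positions (Hamming distance). It is a naive quadratic-time baseline, not one of the efficient algorithms the papers present, and the function name `mappability_naive` is hypothetical.

```python
def mappability_naive(text: str, m: int, k: int) -> list[int]:
    """Naive (k, m)-mappability table.

    Entry i is the number of start positions j != i such that the
    length-m substrings at i and j have at most k mismatches.
    Assumption: "mismatches" means Hamming distance.
    """
    num_starts = max(len(text) - m + 1, 0)
    counts = [0] * num_starts
    for i in range(num_starts):
        # Compare each unordered pair once and credit both positions.
        for j in range(i + 1, num_starts):
            mismatches = 0
            for a, b in zip(text[i:i + m], text[j:j + m]):
                if a != b:
                    mismatches += 1
                    if mismatches > k:
                        break  # early exit: pair already exceeds the budget
            if mismatches <= k:
                counts[i] += 1
                counts[j] += 1
    return counts


if __name__ == "__main__":
    # Length-2 substrings of "AAAB" are AA, AA, AB.
    print(mappability_naive("AAAB", m=2, k=0))
    print(mappability_naive("AAAB", m=2, k=1))
```

This runs in O(n² m) time, which is why it is only useful as a correctness reference for small inputs.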
Efficient Computation of Sequence Mappability
[chapter]
2018
Lecture Notes in Computer Science
Sequence mappability is an important task in genome re-sequencing. ...
In the (k, m)-mappability problem, for a given sequence T of length n, our goal is to compute a table whose ith entry is the number of indices j ≠ i such that length-m substrings of T starting at positions ...
In turn, the process of re-sequencing depends heavily on how mappable a genome is given a set of reads of some fixed length m. ...
doi:10.1007/978-3-030-00479-8_2
fatcat:skkrj4oat5adtj73i6phpwi7n4
Fast Computation and Applications of Genome Mappability
2012
PLoS ONE
We present a fast mapping-based algorithm to compute the mappability of each region of a reference genome up to a specified number of mismatches. ...
Knowing the mappability of a genome is crucial for the interpretation of massively parallel sequencing experiments. ...
Acknowledgments We would like to thank Rachel Harte from the University of California Santa Cruz for her substantial help in the integration of our mappability tracks into the UCSC Genome Browser. ...
doi:10.1371/journal.pone.0030377
pmid:22276185
pmcid:PMC3261895
fatcat:suu3w7k7qjfknetlv2p2exc5da
GenMap: Ultra-fast Computation of Genome Mappability
2020
Bioinformatics
Motivation: Computing the uniqueness of k-mers for each position of a genome while allowing for up to e mismatches is computationally challenging. ...
This allows for the computation of marker sequences or finding candidates for probe design by identifying approximate k-mers that are unique to a genome or that are present in all genomes. ...
Acknowledgements The authors acknowledge the support of the de.NBI network for bioinformatics infrastructure, the Intel SeqAn IPCC and the IMPRS for Computational Biology and Scientific Computing. ...
doi:10.1093/bioinformatics/btaa222
pmid:32246826
pmcid:PMC7320602
fatcat:nlhqhjtokzbanimmj5zsnviite
GenMap: Fast and Exact Computation of Genome Mappability
[article]
2019
bioRxiv
pre-print
We present a fast and exact algorithm to compute the (k,e)-mappability. Its inverse, the (k,e)-frequency counts the number of occurrences of each k-mer with up to e errors in a sequence. ...
We also show that mappability can be computed on multiple sequences to identify marker genes illustrated by the example of E. coli strains. ...
Acknowledgements The authors acknowledge the support of the de.NBI network for bioinformatics infrastructure, the Intel SeqAn IPCC and the IMPRS for Computational Biology and Scientific Computing. ...
doi:10.1101/611160
fatcat:h7vr3jtvezaxjpywrblg52sjwm
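Note: the relationship between (k,e)-frequency and (k,e)-mappability described in the snippet above can be sketched as follows. This is a brute-force illustration under two assumptions that the truncated snippet does not spell out: "errors" are treated as mismatches (Hamming distance), and each k-mer counts its own occurrence, so every frequency is at least 1. The function name `frequency_and_mappability` is hypothetical and this is not GenMap's algorithm.

```python
def frequency_and_mappability(seq: str, k: int, e: int) -> tuple[list[int], list[float]]:
    """Brute-force (k, e)-frequency and its inverse, the (k, e)-mappability.

    freq[i] counts the positions j (including j == i) whose k-mer differs
    from the k-mer at i in at most e positions.
    Assumption: errors are mismatches only (no insertions or deletions).
    """
    num_kmers = max(len(seq) - k + 1, 0)
    freq = [0] * num_kmers
    for i in range(num_kmers):
        for j in range(num_kmers):
            mismatches = sum(a != b for a, b in zip(seq[i:i + k], seq[j:j + k]))
            if mismatches <= e:
                freq[i] += 1
    # Every k-mer matches itself, so freq[i] >= 1 and the division is safe.
    mappability = [1.0 / f for f in freq]
    return freq, mappability


if __name__ == "__main__":
    freq, mapp = frequency_and_mappability("AAAB", k=2, e=0)
    print(freq, mapp)
```

Under these assumptions a mappability of 1.0 marks a k-mer that is unique in the sequence, and smaller values mark increasingly repetitive positions.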
BS-Seeker2: a versatile aligning pipeline for bisulfite sequencing data
2013
BMC Genomics
BS-Seeker2 improves mappability over existing aligners by using local alignment. It can also map reads from RRBS library by building special indexes with improved efficiency and accuracy. ...
Libraries such as whole genome bisulfite sequencing (WGBS) and reduced represented bisulfite sequencing (RRBS) are widely used for generating DNA methylomes, demanding efficient and versatile tools for ...
Conclusions We provide a BS alignment pipeline, BS-Seeker2, for fast and accurate mapping of BS reads from various types of library. ...
doi:10.1186/1471-2164-14-774
pmid:24206606
pmcid:PMC3840619
fatcat:wdo4csgmozbyvmgia6oy2z67bi
Faster Algorithms for 1-Mappability of a Sequence
[chapter]
2017
Lecture Notes in Computer Science
In the k-mappability problem, we are given a string x of length n and integers m and k, and we are asked to count, for each length-m factor y of x, the number of other factors of length m of x that are ...
We focus here on the version of the problem where k = 1. The fastest known algorithm for k = 1 requires time O(mn log n/ log log n) and space O(n). ...
Another direction of practical interest is thus to devise efficient algorithms for the problems of 1-mappability and k-mappability for the External Memory model of computation. ...
doi:10.1007/978-3-319-71147-8_8
fatcat:hgqbvfm24bep5p67yqx2z5b5xa
Faster algorithms for 1-mappability of a sequence
[article]
2017
arXiv
pre-print
In the k-mappability problem, we are given a string x of length n and integers m and k, and we are asked to count, for each length-m factor y of x, the number of other factors of length m of x that are ...
We focus here on the version of the problem where k = 1. The fastest known algorithm for k = 1 requires time O(mn log n/ log log n) and space O(n). ...
Another direction of practical interest is thus to devise efficient algorithms for the problems of 1-mappability and k-mappability for the External Memory model of computation. ...
arXiv:1705.04022v1
fatcat:xh4iqa7ufvbgzfgofmizebfiie
PICS: Probabilistic Inference for ChIP-seq
2010
Biometrics
In order to improve the computational efficiency of the PICS package, we recommend the utilisation of the parallel package, which allows for easy parallel computations. ...
which stores the sequences of genome locations and associated annotations. ...
doi:10.1111/j.1541-0420.2010.01441.x
pmid:20528864
fatcat:lhbuhiji2reupdbopdbhfdfjo4
False positives in trans-eQTL and co-expression analyses arising from RNA-sequencing alignment errors
2019
F1000Research
Sequence similarity among distinct genomic regions can lead to errors in alignment of short reads from next-generation sequencing. ...
Over 75% of trans-eQTLs using a standard pipeline occurred between regions of sequence similarity and therefore could be due to alignment errors. ...
They also provide software for efficiently computing cross-mappability available at a GitHub link. The command line software has detailed instructions online. ...
doi:10.12688/f1000research.17145.2
fatcat:buzgg7bep5a3tayjhvbxwctpoa
Efficient and Comprehensive Representation of Uniqueness for Next-Generation Sequencing by Minimum Unique Length Analyses
2013
PLoS ONE
We have developed the Minimum Unique Length Tool (MULTo), a framework for efficient and comprehensive representation of mappability information, through identification of the shortest possible length required ...
As next generation sequencing technologies are getting more efficient and less expensive, RNA-Seq is becoming a widely used technique for transcriptome studies. ...
In this paper we present a novel approach to efficiently and comprehensively describe mappability of a genome or transcriptome. ...
doi:10.1371/journal.pone.0053822
pmid:23349747
pmcid:PMC3548888
fatcat:rmjnw25v7zflleib5uxi2bmnga
CLImAT: accurate detection of copy number alteration and loss of heterozygosity in impure and aneuploid tumor samples using whole-genome sequencing data
2014
Computer applications in the biosciences : CABIOS
Therefore, efficient computational methods are required to address these issues. ...
Motivation: Whole-genome sequencing of tumor samples has been demonstrated as an efficient approach for comprehensive analysis of genomic aberrations in cancer genome. ...
Funding: National Natural Science Foundation of China (31100955, 61101061). Conflict of Interest: none declared. ...
doi:10.1093/bioinformatics/btu346
pmid:24845652
pmcid:PMC4155249
fatcat:6v765jmdfzeurifsilmwruah34
MaSC: mappability-sensitive cross-correlation for estimating mean fragment length of single-end short-read sequencing data
2013
Computer applications in the biosciences : CABIOS
We observe that the mappability of different parts of the genome can introduce an artificial bias into cross-correlation computations, resulting in incorrect fragment-length estimates. ...
Motivation: Reliable estimation of the mean fragment length for next-generation short-read sequencing data is an important step in next-generation sequencing analysis pipelines, most notably because of ...
Naïve cross-correlation, on the other hand, simply computes correlation between rows 1 and 4, regardless of mappability more efficient, especially if the lists of reads and mappable intervals are short ...
doi:10.1093/bioinformatics/btt001
pmid:23300135
pmcid:PMC3570216
fatcat:2g6xschejnaurpshsbqkuvr2rm
From Wet-Lab to Variations: Concordance and Speed of Bioinformatics Pipelines for Whole Genome and Whole Exome Sequencing
2016
Human Mutation
Therefore, it is essential to evaluate the robustness of the variant detection process taking into account the computing resources required. ...
We have benchmarked six combinations of state-of-the-art read aligners (BWA-MEM and GEM3) and variant callers (FreeBayes, GATK HaplotypeCaller, SAMtools) on whole genome and whole exome sequencing data ...
Acknowledgments We thank Raul Tonda for help with pipeline implementation and figure generation, and Nvidia for their donation of part of the systems used in this work. ...
doi:10.1002/humu.23114
pmid:27604516
pmcid:PMC5129537
fatcat:5nsmszxr6bhi5dp6j63oh7m5fe