On-line String Matching in Highly Similar DNA Sequences
2017
Mathematics in Computer Science
We consider the problem of on-line exact string matching of a pattern in a set of highly similar sequences. This can be useful in cases where indexing the sequences is not feasible. ...
We exhibit experimental results showing that our algorithm is much faster than searching for the pattern in each sequence with a very fast on-line exact string matching algorithm. ...
For the FJS algorithm, we launch the execution on the texts one by one. We ran the FJS algorithm on each sequence successively. ...
doi:10.1007/s11786-016-0280-2
fatcat:xc7zvtnx4fey3jkik3phrkex3m
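The baseline mentioned in the snippets above, searching each sequence separately with an exact matcher, might look roughly like the following. This is a minimal sketch rather than the paper's code: Python's built-in `str.find` stands in for FJS, and the names `find_all` and `search_each` are hypothetical.

```python
def find_all(text, pattern):
    """Return every start position of pattern in text (overlaps included)."""
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        # Resume one character later so overlapping matches are reported.
        start = text.find(pattern, start + 1)
    return positions


def search_each(sequences, pattern):
    """Scan the sequences one by one; str.find is a stand-in for FJS."""
    return {i: find_all(seq, pattern) for i, seq in enumerate(sequences)}
```

The cost of this baseline grows with the total length of all sequences, which is the redundancy an algorithm aware of the similarity between sequences tries to avoid.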
A fast pattern matching algorithm for highly similar sequences
2014
2014 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
In this paper we propose a very efficient algorithm that solves the on-line exact pattern matching problem in a set of highly similar DNA sequences. ...
There is thus a strong need for efficient algorithms for performing fast pattern matching in such specific sets of sequences. ...
problem on a set of highly similar sequences. ...
doi:10.1109/bibm.2014.6999384
dblp:conf/bibm/NsiraLE14
fatcat:pr7b2dlgizcwbib5kurzf3pjeq
Indexing DNA Sequences Using q-Grams
[chapter]
2005
Lecture Notes in Computer Science
Contributing to the interest, this paper presents a method for indexing the DNA sequences efficiently based on q-grams to facilitate similarity search in a DNA database and sidestep the need for linear ...
A two-level index, consisting of a hash table and c-trees, is proposed based on the q-grams of DNA sequences. ...
Conclusion We have devised a novel two-level index structure based on q-grams of the DNA sequences which can support efficient similarity search in DNA sequence database. ...
doi:10.1007/11408079_4
fatcat:sczrqs2aingcxaka5wzavcqp7q
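To illustrate the q-gram idea in the entry above, here is a minimal sketch of a hash-table index keyed by q-grams. It covers only a generic first level; the c-tree level from the paper is not reproduced, and `build_qgram_index` and `candidates` are hypothetical names chosen for this example.

```python
from collections import defaultdict


def build_qgram_index(sequences, q):
    """Map each q-gram to the (sequence id, position) pairs where it occurs."""
    index = defaultdict(list)
    for seq_id, seq in enumerate(sequences):
        for pos in range(len(seq) - q + 1):
            index[seq[pos:pos + q]].append((seq_id, pos))
    return index


def candidates(index, query, q):
    """Return ids of sequences sharing at least one q-gram with the query."""
    hits = set()
    for pos in range(len(query) - q + 1):
        for seq_id, _ in index.get(query[pos:pos + q], ()):
            hits.add(seq_id)
    return hits
```

Only the candidate sequences would then need a more expensive similarity computation, which is how such an index avoids a linear scan of the database.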
SPARK-MSNA: Efficient algorithm on Apache Spark for aligning multiple similar DNA/RNA sequences with supervised learning
2019
Scientific Reports
Knowledge driven algorithms utilizing features of input sequences, such as high similarity in case of DNA sequences, can help in improving the efficiency of DNA MSA to assist in phylogenetic tree construction ...
The algorithm uses a suffix tree to identify common substrings and a modified Needleman-Wunsch algorithm for pairwise alignments. ...
DNA sequences are highly similar compared to protein sequences. ...
doi:10.1038/s41598-019-42966-5
pmid:31036850
pmcid:PMC6488671
fatcat:muqsmksj5bfnrdwwkdqfcveywe
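For reference, the classic (unmodified) Needleman-Wunsch global alignment score can be sketched as below. The paper's modification and its suffix-tree step are not shown, and the scoring values (match = 1, mismatch = -1, gap = -1) are assumptions made only for this illustration.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score of a and b using two rows of the DP table."""
    # Row 0: aligning the empty prefix of a against prefixes of b costs gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # Column 0: prefix of a against the empty string.
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]
```

Keeping only two rows gives the score in memory linear in `len(b)`; recovering the alignment itself would require the full table or a traceback scheme.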
An Improved Fast Search Method Using Histogram Features for DNA Sequence Database
2010
Zenodo
An overlapping processing step is newly added to improve the robustness of the algorithm. A large number of DNA sequences with low similarity will be excluded from later searching. ...
Experimental results using GenBank sequence data show the proposed method combining histogram information and Smith-Waterman algorithm is more efficient for DNA sequence search. ...
In section II, we will first introduce the proposed local search algorithm using histogram features for DNA sequences in detail. ...
doi:10.5281/zenodo.1076388
fatcat:nnti2eptzrahrmyb4i4dzimaoy
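The filtering idea in the entry above, discarding low-similarity sequences cheaply before running Smith-Waterman, could be sketched as follows. The exact histogram features and the overlapping processing used in the paper are not given in the snippets, so the base-count histogram, the intersection measure, the `threshold` value, and all function names here are assumptions for illustration only.

```python
from collections import Counter


def histogram(seq):
    """Count occurrences of A, C, G and T in seq."""
    counts = Counter(seq)
    return [counts[base] for base in "ACGT"]


def histogram_similarity(a, b):
    """Histogram intersection normalised by the shorter length, in [0, 1]."""
    shorter = min(len(a), len(b))
    if shorter == 0:
        return 0.0
    overlap = sum(min(x, y) for x, y in zip(histogram(a), histogram(b)))
    return overlap / shorter


def prefilter(database, query, threshold=0.8):
    """Keep only sequences similar enough to deserve a full alignment."""
    return [s for s in database if histogram_similarity(s, query) >= threshold]
```

Sequences that pass `prefilter` would then be handed to a full local alignment; the histogram comparison is linear in sequence length, whereas the alignment is quadratic.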
A Filtering Algorithm for Efficient Retrieving of DNA Sequence
2009
Journal of clean energy technologies
Index Terms: exact string matching, Aho-Corasick algorithm, sequence comparison, Smith-Waterman algorithm. ...
The algorithm filters out the DNA sequences in the database that are expected to be irrelevant, so that they are not passed to the dynamic-programming-based optimal alignment process. ...
In general, this ranking process does not guarantee that the DNA sequences expected to be highly similar to a query are positioned at the top of the ranked list. ...
doi:10.7763/ijcte.2009.v1.16
fatcat:hjbt5el2frgaho6yenaukpptdq
Advantages and GPU implementation of high-performance indexed DNA search based on suffix arrays
2011
2011 International Conference on High Performance Computing & Simulation
of DNA sequences is presented. ...
When compared with the CPU, the results demonstrate the possibility to achieve speedups as high as 85 when using the suffix array in the GPU, thus making it an adequate choice for high-performance bioinformatics ...
The usage of a suffix array for string matching (in this case for DNA sequence alignment) is similar to using any other sorted array to search for a given element. ...
doi:10.1109/hpcsim.2011.5999806
dblp:conf/ieeehpcs/EncarnacaoSR11
fatcat:wbavz72f4fapbfa3qilfdfwb7i
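The remark above that searching a suffix array resembles searching any sorted array can be made concrete with two binary searches. This is a CPU-only sketch with a deliberately naive construction; it is not the GPU implementation from the paper, and the function names are hypothetical.

```python
def build_suffix_array(text):
    """Naive construction: sort suffix start positions lexicographically."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    """Return sorted start positions of pattern, found via binary search."""
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:  # Lower bound: first suffix whose m-prefix is >= pattern.
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo
    hi = len(sa)
    while lo < hi:  # Upper bound: first suffix whose m-prefix is > pattern.
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[first:lo])
```

Each query costs O(m log n) character comparisons once the array is built; a production index would use a faster construction than the sort shown here.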
Short Read Alignment Based on Maximal Approximate Match Seeds
2020
Frontiers in Molecular Biosciences
Here, we propose a novel sequence alignment algorithm, named MAM, which can efficiently align short DNA sequences. ...
Thus, most alignment tools prefer to simply discard highly repetitive seeds, but this may cause the true alignment to be missed. ...
Repetitive DNA sequences are multiple copies of sequences with high similarity that occur throughout the genome. ...
doi:10.3389/fmolb.2020.572934
pmid:33251246
pmcid:PMC7674947
fatcat:wopc6aj3i5bytbqgw6asufw6gu
Implementation and performance analysis of efficient index structures for DNA search algorithms in parallel platforms
2012
Concurrency and Computation
performing DNA ... Figure 2. ...
These indexes have been widely adopted to perform DNA search operations of short query sequences against a large reference sequence in general purpose processors. ...
Evaluation of index-based search algorithms. To evaluate the conceived highly concurrent implementations of the considered index-based search algorithms, a set of real DNA sequence data obtained from the ...
doi:10.1002/cpe.2970
fatcat:rr5nixxos5f6tdpdxaoe2xefpa
Multiple Co-Evolutionary Networks Are Supported by the Common Tertiary Scaffold of the LacI/GalR Proteins
2013
PLoS ONE
Alternatively, the tertiary scaffold might be adaptable, accommodating a unique set of functionally important sites for each paralogous function. ...
Functionally important positions were identified by conservation and co-evolutionary sequence analyses. ...
Brown (University of California at Santa Barbara) for providing the source code for his implementation of ZNMI. ...
doi:10.1371/journal.pone.0084398
pmid:24391951
pmcid:PMC3877293
fatcat:kqcjkbziojfrvgfcavcnovv3fa
Evolution of biosequence search algorithms: a brief survey
[article]
2018
arXiv
pre-print
The paper surveys the evolution of main algorithmic techniques to compare and search biological sequences. ...
We highlight key algorithmic ideas that emerged in response to several interconnected factors: shifts of biological analytical paradigm, advent of new sequencing technologies, and a substantial increase in ...
Acknowledgements Many thanks go to Karel Břinda for his helpful comments and suggestions.
Bibliography ...
arXiv:1808.01038v4
fatcat:uiyjrwvgprgu3nfcu6i47o4wpe
Googling DNA sequences on the World Wide Web
2009
BMC Bioinformatics
We have developed a novel algorithm and implemented it for searching species-specific genomic sequences, DNA barcodes, by using popular web-based methods such as Google. ...
We developed an alignment-independent, character-based algorithm that divides a sequence library (DNA barcodes) and the query sequence into words. ...
Acknowledgements Thanks to Donal Hickey for his advice and help in the development of this algorithm. ...
doi:10.1186/1471-2105-10-s14-s4
pmid:19900300
pmcid:PMC2775150
fatcat:on7fdsthqzcanirzh3xddalwa4
Querying Highly Similar Structured Sequences via Binary Encoding and Word Level Operations
[chapter]
2012
IFIP Advances in Information and Communication Technology
In this paper we present efficient data structures and algorithms for the High Similarity Sequencing Problem. ...
In the High Similarity Sequencing Problem we are given the sequences $S_0, S_1, \ldots, S_k$ where $S_j = e_{j_1} I_{\sigma_1} e_{j_2} I_{\sigma_2} e_{j_3} I_{\sigma_3} \ldots e_{j_\ell} I_{\sigma_\ell}$ and must perform pattern matching on the set of sequences ...
Recent research has focused on exploiting the inherent similarity present in multiple DNA sequences to allow for in-memory analysis of a large number of DNA sequences. ...
doi:10.1007/978-3-642-33412-2_60
fatcat:h2o2brxqynhvflhpqiucrwn74y
Training-free measures based on algorithmic probability identify high nucleosome occupancy in DNA sequences
2019
Nucleic Acids Research
We introduce and study a set of training-free methods of an information-theoretic and algorithmic complexity nature that we apply to DNA sequences to identify their potential to identify nucleosomal binding ...
We test the measures on well-studied genomic sequences of different sizes drawn from different sources. ...
Species close to each other will have similar DNA sequence entropy values, allowing lossless compression algorithms to compress statistical regularities of genomes of related species with similar compression ...
doi:10.1093/nar/gkz750
pmid:31511887
pmcid:PMC6846163
fatcat:7wydzyy62nf3njwqjauy4oyx2u
Querying highly similar sequences
2013
International Journal of Computational Biology and Drug Design
We present an asymptotically fast $O(n + occ \log occ)$-time algorithm, as well as a practical $O(nk/w)$-time algorithm for solving this problem, where $n$ is the length of a sequence, $occ$ is the number of ...
The Extreme Similarity Sequencing problem consists of finding occurrences of a pattern $p$ in a set $S_0, S_1, \ldots, S_k$ of sequences of equal length, where $S_i$, for all $1 \le i \le k$, differs from $S_0$ by a constant ...
These allow for good compression rates for single repetitive sequences; however when there is a large number of sequences which are highly similar the entropy does not change. ...
doi:10.1504/ijcbdd.2013.052206
pmid:23428478
fatcat:35bw3t3lobe6ndd7ojbfpgeexu