
Subject Index

2007 Journal of Discrete Algorithms  
works, 408 Repeat Linear time algorithm for the longest common repeat problem, 243 Reversal distance A linear time algorithm for the inversion median problem in circular bacterial genomes, 637 Ring  ...  in wireless networks, 395 Linear time algorithm for the longest common repeat problem, 243; An upper bound on the hardness of exact matrix based motif discovery, 706 String matching Parameterized matching  ... 
doi:10.1016/s1570-8667(07)00076-7 fatcat:wfqxglrznfb6do3wyittd5pfbi

New Error Tolerant Method to Search Long Repeats in Symbol Sequences [article]

Sergey Tsarev, Michael Sadovsky
2016 arXiv   pre-print
This paper is the extended and detailed version of the presentation at the third International Conference on Algorithms for Computational Biology to be held at Trujillo, Spain, June 21-22, 2016.  ...  A new method to identify all sufficiently long repeating substrings in one or several symbol sequences is proposed.  ...  Znamenskij for useful discussions; the idea of Vernier gauge for acceleration of search was also independently found by him.  ... 
arXiv:1604.01317v1 fatcat:pzll4xkc3rgwxfhg7pbacsadce

A clustering method for repeat analysis in DNA sequences

Natalia Volfovsky, Brian J Haas, Steven L Salzberg
2016 Genome Biology  
This method has been incorporated into a system that can find repeats in individual genome sequences or sets of sequences, and that can organize those repeats into classes.  ...  Conclusions: We propose a new clustering method for analysis of the repeat data captured in suffix trees.  ...  RepeatMasker uses a database of known repeat sequences and implements a string-matching algorithm to find copies of those repeats in a new sequence.  ... 
doi:10.1186/gb-2001-2-8-research0027 fatcat:7pre7nf62nckvmzpjmhge4kjti

A clustering method for repeat analysis in DNA sequences

N Volfovsky, B J Haas, S L Salzberg
2001 Genome Biology  
This method has been incorporated into a system that can find repeats in individual genome sequences or sets of sequences, and that can organize those repeats into classes.  ...  We propose a new clustering method for analysis of the repeat data captured in suffix trees.  ...  RepeatMasker uses a database of known repeat sequences and implements a string-matching algorithm to find copies of those repeats in a new sequence.  ... 
pmid:11532211 pmcid:PMC55324 fatcat:2tfat26xgfexpboncaej4icqy4

A software program combining sequence motif searches with keywords for finding repeats containing DNA sequences

M. Bilgen, M. Karaca, A. N. Onus, A. G. Ince
2004 Bioinformatics  
We developed a new PC-based standalone software analysis program, combining sequence motif searches with keywords such as organs, tissues, cell lines or development stages for finding exact, inexact and  ...  Tandem Repeats Analyzer 1.5 (TRA) has several advanced repeat search parameters/options over other repeat finder programs as it does not only accept GenBank, FASTA and expressed sequence tag (EST) sequence  ...  Exact Repeats (SSRs): TRA uses a simple algorithm for detecting exact repeats.  ... 
doi:10.1093/bioinformatics/bth410 pmid:15256410 fatcat:y4izljyqtzhv7evovpbhqitjsy

Computation and visualization of degenerate repeats in complete genomes

S Kurtz, E Ohlebusch, C Schleiermacher, J Stoye, R Giegerich
2000 Proceedings. International Conference on Intelligent Systems for Molecular Biology  
A systematic study of repetitive DNA on a genomic or inter-genomic scale requires extensive algorithmic support.  ...  Efficient and complete detection of various types of repeats is provided together with an evaluation of significance, interactive visualization, and simple interfacing to other analysis programs.  ...  (Sagot & Myers 1998) present an algorithm for finding tandem arrays (multiple occurrences of substrings similar to a common model in a row).  ... 
pmid:10977084 fatcat:ashwkmenv5hwpndwq564yfhbbq

StarDB

Majed Sahli, Essam Mansour, Panos Kalnis
2015 Proceedings of the VLDB Endowment  
In this paper, we demonstrate StarDB, a distributed database system for analytics on strings. StarDB hides data and system complexities and allows users to focus on analytics.  ...  It uses a comprehensive set of parallel string operations and provides a declarative query language to solve complex queries.  ...  The task translates to generating repeated motifs, generating common motifs, then finding the motifs that are both repeated and common. The following StarQL query is an example for this scenario.  ... 
doi:10.14778/2824032.2824082 fatcat:xyf7qe7smrgaxnbd646hanldim

REPuter: the manifold applications of repeat analysis on a genomic scale

S. Kurtz
2001 Nucleic Acids Research  
This article circumscribes the wide scope of repeat analysis using applications in five different areas of sequence analysis: checking fragment assemblies, searching for low copy repeats, finding unique  ...  A systematic study of repetitive DNA on a genomic or inter-genomic scale requires extensive algorithmic support.  ...  In this way, REPuter provides a simple plausibility check for gene structures predicted by other software tools. 6 , we find that there are still regions free from repeats.  ... 
doi:10.1093/nar/29.22.4633 pmid:11713313 pmcid:PMC92531 fatcat:h34iesjiqvfhxbjycyb36hcbpq

A simple and fast DNA compressor

Giovanni Manzini, Marcella Rastero
2004 Software, Practice & Experience  
For this reason most DNA compressors work by searching and encoding approximate repeats. We depart from this strategy by searching and encoding only exact repeats.  ...  Our approach leads to an algorithm which is an order of magnitude faster than any other algorithm and achieves a compression ratio very close to the best DNA compressors.  ...  Acknowledgments We would like to thank Xin Chen and Stefano Lonardi for their assistance in the testing of their compression algorithms.  ... 
doi:10.1002/spe.619 fatcat:qs6swtwzlrfzhgmwyxwgkds63i

Querying and Mining Strings Made Easy [chapter]

Majed Sahli, Essam Mansour, Panos Kalnis
2017 Lecture Notes in Computer Science  
This paper presents StarQL, a generic and declarative query language for strings.  ...  StarQL is based on a native string data model that allows StarQL to support a large variety of string operations and provide semantic-based query optimization.  ...  This paper proposed StarQL, a declarative query language for strings; and StarIN, a scalable and efficient data structure.  ... 
doi:10.1007/978-3-319-69179-4_1 fatcat:7mwy47qmhzbkbd3m2mgv2qkgmm

A survey of exact motif finding algorithms

Ali Basim Yousif, Hussein Keitan Al-Khafaji, Thekra Abbas
2022 Indonesian Journal of Electrical Engineering and Computer Science  
In this paper, we provide a survey of exact DNA motif finding algorithms and their working principles with a suitable comparison among these algorithms to provide an essential step for researchers in this  ...  Despite the efforts made to date to produce robust algorithms, DNA motif finding remains a difficult task for researchers in this field.  ...  Another reason for the efficiency of the PMS8 algorithm [47] is its ability to produce  ... 
doi:10.11591/ijeecs.v27.i2.pp1109-1118 fatcat:65uu4r7vnbebbiqjenit26clz4

An efficient algorithm for finding short approximate non-tandem repeats

E. F. Adebiyi, T. Jiang, M. Kaufmann
2001 Bioinformatics  
Using this result, we present a sub-quadratic algorithm for finding all short (i.e., of length O(log N)) approximate repeats.  ...  We give a careful theoretical characterization of the set of seeds (i.e., some maximal exact repeats) required by the algorithm, and prove a sublinear bound on their expected numbers.  ...  Acknowledgments We thank Jop Sibeyn for some discussions on the analysis of the expected length of the longest repeat.  ... 
doi:10.1093/bioinformatics/17.suppl_1.s5 pmid:11472987 fatcat:4vuqdv3ztjahbh5udlmmsruy7u

An algorithm for finding tandem repeats of unspecified pattern size

Gary Benson
1998 Proceedings of the second annual international conference on Computational molecular biology - RECOMB '98  
In this paper, we present a new algorithm for finding tandem repeats in DNA sequences without the need to specify either the pattern or pattern size.  ...  A tandem repeat is two or more contiguous, approximate copies of a pattern of nucleotides. Tandem repeats occur frequently in the human genome.  ...  Algorithm Overview Our algorithm finds tandem repeats by observing a collection of matching k-tuples at a common distance in a common region of the DNA sequence.  ... 
doi:10.1145/279069.279079 dblp:conf/recomb/Benson98 fatcat:im7oliovznhstcgpezukd4xqbi

Discovering a domain alphabet

Michael D. Schmidt, Hod Lipson
2009 Proceedings of the 11th Annual conference on Genetic and evolutionary computation - GECCO '09  
Here we introduce a method that automatically identifies a small alphabet for a problem domain.  ...  We demonstrate this process on symbolic regression for a variety of physics problems. The method discovers key terms relating to concepts such as energy and momentum.  ...  Therefore, finding large repeated building blocks is a strong indication the building block is a nontrivial building block useful throughout the problem domain.  ... 
doi:10.1145/1569901.1570047 dblp:conf/gecco/SchmidtL09 fatcat:l5gnoep4rnh2dp3wmu5g2brkwy

Identifying Inverted Repeat Structure In Dna Sequences Using Correlation Framework

Ravi Gupta, Ankush Mittal, Kuldip Singh
2006 Zenodo  
Suffix tree technique [8] transforms the inverted repeat detection problem to finding longest common extension subsequence.  ...  After all the stopping criteria for the DNA sequence fails, a search is made for an exact contiguous inverted repeat.  ...  C C T G G C C G G A A C G A A C T G C T A C C A G T T C T G T T T C C C T C G G A C C C G C C T T T A A G A G A G A C A A A G G G  ... 
doi:10.5281/zenodo.53006 fatcat:mwbmxcn22vcvrgpyj46f63yb7y