A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2007; you can also visit the original URL.
The file type is application/pdf.
Fast Motif Search in Protein Sequence Databases
[chapter]
2006
Lecture Notes in Computer Science
Regular expression pattern matching is widely used in computational biology. Searching through a database of sequences for a motif (a simple regular expression) or its variations is an important interactive process that requires fast motif-matching algorithms. In this paper, we explore and evaluate various representations of the database of sequences using suffix trees for two types of query problems for a given regular expression: 1) find the first match, and 2) find all matches. Answering
doi:10.1007/11753728_67
fatcat:ot4lthbmfjhk7nzbtxwaq7sfsm