A FAST SEARCH METHOD FOR DNA SEQUENCE DATABASE USING HISTOGRAM INFORMATION
English

QIU CHEN, KOJI KOTANI, FEIFEI LEE, TADAHIRO OHMI
2011 International Journal of Bioinformatics Research  
DNA sequence search is a fundamental topic in bioinformatics. The Smith-Waterman algorithm achieves the highest accuracy among sequence alignment tools, but it requires considerable computation time when searching a large DNA sequence database. In contrast, BLAST and FASTA improve search speed by using heuristic approaches, but they may miss an alignment or give inaccurate output. This paper presents an efficient hierarchical method that improves search speed while keeping accuracy constant. For a given query sequence, a fast histogram-based method is first used to scan the sequences in the database, and a large number of DNA sequences with low similarity are excluded from later searching. The Smith-Waterman algorithm is then applied to each remaining sequence. Experimental results show that the proposed method, which combines histogram information with the Smith-Waterman algorithm, is a more efficient algorithm for DNA sequence search.
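
The two-stage search described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the histograms are built or compared, so everything specific here is an assumption made for illustration: overlapping k-mer counts as the histogram, histogram intersection as the similarity measure, the `threshold` cutoff, and the scoring parameters. The function names (`kmer_histogram`, `histogram_similarity`, `smith_waterman_score`, `hierarchical_search`) are hypothetical and do not come from the paper.

```python
from collections import Counter


def kmer_histogram(seq, k=3):
    """Count overlapping k-mers in seq (assumed histogram feature)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def histogram_similarity(h1, h2):
    """Histogram intersection, normalised by the smaller histogram's total."""
    total = min(sum(h1.values()), sum(h2.values()))
    if total == 0:
        return 0.0
    shared = sum(min(count, h2[kmer]) for kmer, count in h1.items())
    return shared / total


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score with a linear gap penalty."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def hierarchical_search(query, database, k=3, threshold=0.5):
    """Stage 1: histogram screen. Stage 2: Smith-Waterman on survivors."""
    query_hist = kmer_histogram(query, k)
    hits = []
    for name, seq in database.items():
        # Cheap screen: skip sequences whose histogram is too dissimilar.
        if histogram_similarity(query_hist, kmer_histogram(seq, k)) < threshold:
            continue
        hits.append((name, smith_waterman_score(query, seq)))
    # Highest alignment score first.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    db = {"a": "ACGTACGTAC", "b": "TTTTTTTTTT", "c": "ACGTACGAAC"}
    print(hierarchical_search("ACGTACGT", db))
```

In this toy database, sequence `b` shares no 3-mers with the query and is discarded by the screen, so the quadratic-time alignment runs only on `a` and `c`. How well such a screen preserves accuracy depends on the chosen histogram feature and cutoff, which are not taken from the paper here.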
doi:10.9735/0975-3087.3.1.161-166 fatcat:2yegrsul7nfqvlempmrmiofxly