Improved Alignment of Protein Sequences Based on Common Parts [chapter]

David Hoksza
Bioinformatics Research and Applications  
In the last twenty years, protein databases have been growing exponentially. To speed up the search, heuristic approaches have been proposed, and their accuracy has steadily improved, but exact search is still needed in some cases. The only exact search algorithm remains SSEARCH (or its clones), which sequentially scans a database of protein sequences and performs a full alignment against each sequence. Because exact search is still needed, we focus on improving the sequential search algorithm. We decrease the cost of computing the alignment of a pair of protein sequences when searching large databases. This is achieved by reusing alignment calculations for common parts of the sequences, without loss of accuracy. With this method, we reduced the computational costs by up to 20%, depending on the database size and the subset used. We also implemented an approximate search, which further reduces computational costs at the expense of some accuracy.
doi:10.1007/978-3-540-79450-9_9 dblp:conf/isbra/Hoksza08 fatcat:u6l2moi7tbeldgjgxtp7yksp64