Optimizing the size of the sequence profiles to increase the accuracy of protein sequence alignments generated by profile-profile algorithms

A. Poleksic, M. Fienup
2008 Bioinformatics  
Motivation: Profile-based protein homology detection algorithms are valuable tools in genome annotation and protein classification. By utilizing information present in the sequences of homologous proteins, profile-based methods are often able to detect extremely weak relationships between protein sequences, as evidenced by large-scale benchmarking experiments such as CASP and LiveBench.
Results: We study the relationship between the sensitivity of a profile-profile method and the size of the sequence profile, which is defined as the average number of different residue types observed at the profile's positions. We also demonstrate that improvements in the sensitivity of a profile-profile method can be made by incorporating a profile-dependent scoring scheme, such as position-specific background frequencies. The techniques presented in this article are implemented in an alignment algorithm, UNI-FOLD. When tested against other well-established methods for fold recognition, UNI-FOLD shows increased sensitivity and specificity in detecting remote relationships between protein sequences.
Availability: The UNI-FOLD web server can be accessed at http://
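To make the definition of profile size concrete, the following is a minimal sketch that computes the average number of distinct residue types per column of a multiple sequence alignment. It is an illustration only, not the UNI-FOLD implementation: the function name `profile_size`, the treatment of gap characters, and the toy alignment are assumptions, and the article may count residue types differently (for example, from weighted frequencies).

```python
from typing import List

# Assumption: '-' and '.' mark gaps and are not counted as residue types.
GAP_CHARS = {"-", "."}


def profile_size(alignment: List[str]) -> float:
    """Average number of distinct residue types per alignment column.

    `alignment` is a list of equal-length aligned sequences (hypothetical
    input format chosen for this sketch).
    """
    if not alignment or len(alignment[0]) == 0:
        raise ValueError("alignment must contain at least one non-empty row")
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned rows must have the same length")

    total_distinct = 0
    for col in range(length):
        # Collect the residue types seen in this column, ignoring gaps.
        residues = {row[col].upper() for row in alignment} - GAP_CHARS
        total_distinct += len(residues)
    return total_distinct / length


if __name__ == "__main__":
    toy_alignment = ["MKV-A", "MRVLA", "MKILG"]  # made-up example data
    print(profile_size(toy_alignment))
```

In the toy alignment the five columns contain 1, 2, 2, 1 and 2 distinct residue types, so the profile size under this definition is 8 / 5 = 1.6. Adding more divergent homologs to the alignment raises this value.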
doi:10.1093/bioinformatics/btn097 pmid:18337259 fatcat:h2efqo2mcvbrbaebixf4wt5sdi