Functional grouping based on signatures in protein termini

Iris Bahir, Michal Linial
2006 Proteins: Structure, Function, and Bioinformatics  
The two ends of each protein are known as the amino (N-) and carboxyl (C-) termini. Short signatures in a protein's termini often carry vital cellular function. No systematic research has been conducted to address the importance of short signatures (3 to 10 amino acids) in protein termini at the proteomic level. Specifically, it is unknown whether such signatures are evolutionarily conserved, and if so, whether this conservation confers shared biological functions. Current signature detection
more » ... thods fail to detect such short signatures due to inadequate statistical scores. The findings presented in this study strongly support the notion that functional significance of protein sets may be captured by short signatures at their termini. A positional search method was applied to over one million proteins from the UniProt database. The result is a collection of about a thousand significant signature groups (SIGs) that include previously identified as well as many novel signatures in protein termini. These SIGs represent protein sets with minimal or no overall sequence similarity excepting the similarity at their termini. The most significant SIGs are assigned by their strong correspondence to functional annotations derived from external databases such as Gene Ontology. Each of the SIGs is associated with the statistical significance of its functional association. These SIGs provide a valuable source for testing previously overlooked signatures in protein termini and allow for the investigation of the role played by such signatures throughout evolution. The SIGs archive and advanced search options are available at http://www.proteus.cs.huji.ac.il.
doi:10.1002/prot.20903 pmid:16475191 fatcat:d7kj2khinzfqzgjewgiu7cvo3q