New in protein structure and function annotation: hotspots, single nucleotide polymorphisms and the 'Deep Web'
Current opinion in drug discovery & development
The rapidly increasing quantity of protein sequence data continues to widen the gap between available sequences and annotations. Comparative modeling suggests some aspects of the 3D structures of approximately half of all known proteins; homology- and network-based inferences annotate some aspect of function for a similar fraction of the proteome. For most known protein sequences, however, there is detailed knowledge about neither their function nor their structure. Comprehensive efforts
... the expert curation of sequence annotations have failed to meet the demand of the rapidly increasing number of available sequences. Only the automated prediction of protein function in the absence of homology can close the gap between available sequences and annotations in the foreseeable future. This review focuses on two novel methods for automated annotation, and briefly presents an outlook on how modern web software may revolutionize the field of protein sequence annotation. First, predictions of protein binding sites and functional hotspots, and the evolution of these into the most successful type of prediction of protein function from sequence will be discussed. Second, a new tool, comprehensive in silico mutagenesis, which contributes important novel predictions of function and at the same time prepares for the onset of the next sequencing revolution, will be described. While these two new sub-fields of protein prediction represent the breakthroughs that have been achieved methodologically, it will then be argued that a different development might further change the way biomedical researchers benefit from annotations: modern web software can connect the worldwide web in any browser with the 'Deep Web' (ie, proprietary data resources). The availability of this direct connection, and the resulting access to a wealth of data, may impact drug discovery and development more than any existing method that contributes to protein annotation.