Prediction of enzyme classification from protein sequence without the use of sequence similarity

M des Jardins, P D Karp, M Krummenacker, T J Lee, C A Ouzounis
1997 Proceedings. International Conference on Intelligent Systems for Molecular Biology  
We describe a novel approach for predicting the function of a protein from its amino-acid sequence.  ...  Our approach uses machine learning (ML) techniques to induce classifiers that predict the EC class of an enzyme from features extracted from its primary sequence.  ...  Acknowledgments This work was supported by SRI International and by the Human Frontiers Science Program.  ... 
pmid:9322021 fatcat:ft7y5axhejh5nie7tyupnrtn44

Quantitative assessment of protein function prediction programs

B.N. Rodrigues, M.B.R. Steffens, R.T. Raittz, I.C.R. Santos-Weiss, J.N. Marchaukoski
2015 Genetics and Molecular Research  
Fast prediction of protein function is essential for high-throughput sequencing analysis.  ...  Bioinformatic resources provide cheaper and faster techniques for function prediction and have helped to accelerate the process of protein sequence characterization.  ...  Conflicts of interest The authors declare no conflict of interest.  ... 
doi:10.4238/2015.december.21.28 pmid:26782400 fatcat:t2grrssmwvfytdl2h5g4ekwts4

Predicting functional family of novel enzymes irrespective of sequence similarity: a statistical learning approach

L. Y. Han, C. Z. Cai, Z. L. Ji, Z. W. Cao, J. Cui, Y. Z. Chen
2004 Nucleic Acids Research  
Several groups have employed a statistical learning method, support vector machines (SVMs), for predicting protein functional family directly from sequence irrespective of sequence similarity.  ...  The function of a protein that has no sequence homolog of known function is difficult to assign on the basis of sequence similarity.  ...  SVM has also been used for classification of enzymes into structural families irrespective of sequence similarity, and the accuracy for assignment of 1178 enzymes is 80% (32).  ... 
doi:10.1093/nar/gkh984 pmid:15585667 pmcid:PMC535691 fatcat:kczn4f6v7zg2topxovwnoo7oza

Universal Deep Sequence Models for Protein Classification [article]

Nils Strodthoff, Patrick Wagner, Markus Wenzel, Wojciech Samek
2019 bioRxiv   pre-print
We argue that a similar level of performance can be reached by leveraging the vast amount of unlabeled protein sequence data available from protein sequence databases using a generic architecture that  ...  Inferring the properties of protein from its amino acid sequence is one of the key problems in bioinformatics.  ...  UDSMProt was implemented using Pytorch [Paszke et al., 2017] and fast.ai and CNN baseline models were implemented in Keras [Chollet et al., 2015].  ... 
doi:10.1101/704874 fatcat:cy6u6i2dl5c3pauz5z2ic4pehe

Prediction of Human Protein Function from Post-translational Modifications and Localization Features

L.J. Jensen, R. Gupta, N. Blom, D. Devos, J. Tamames, C. Kesmir, H. Nielsen, H.H. Stærfeldt, K. Rapacki, C. Workman, C.A.F. Andersen, S. Knudsen (+3 others)
2002 Journal of Molecular Biology  
We show that strategies for the elucidation of protein function may benefit from a number of functional attributes that are more directly related to the linear sequence of amino acids, and hence easier  ...  We have developed an entirely sequence-based method that identifies and integrates relevant features that can be used to assign proteins of unknown function to functional classes, and enzyme categories  ...  Attributes useful in function prediction must not only correlate well with the functional classification scheme, but must also be predictable from sequence with reasonable accuracy.  ... 
doi:10.1016/s0022-2836(02)00379-0 pmid:12079362 fatcat:sxc66jdxnrcxxi7sldeqqqvi4q

Optimized bio-inspired kernels with twin support vector machine using low identity sequences to solve imbalance multiclass classification

S.K. Guramand, R.D.R. Saedudin, R. Hassan, S. Kasim, R. Ramlan, Baraa Wasfi Salim
2019 Journal of environmental biology  
The function of enzymes is performed differently depending on their biochemical mechanisms and is important to the prediction of protein structure and function.  ...  the biological information regarding the protein evolution in the classification process.  ...  The authors also thank the anonymous viewers for the feedback.  ... 
doi:10.22438/jeb/40/3(si)/sp-21 fatcat:msd4wqeujfahvh34nmo254z5mq

A Machine Learning Methodology for Enzyme Functional Classification Combining Structural and Protein Sequence Descriptors [chapter]

Afshine Amidi, Shervine Amidi, Dimitrios Vlachakis, Nikos Paragios, Evangelia I. Zacharaki
2016 Lecture Notes in Computer Science  
In this paper we present a bioinformatics approach that exploits both structural representation and protein sequence similarity in order to predict in silico the EC number of an enzyme using machine learning  ...  Many methods have been developed to quantify the similarity between two protein sequences which are either based on sequence alignment [11] or provide a similarity score without performing prior alignment  ... 
doi:10.1007/978-3-319-31744-1_63 fatcat:iha5dk6s3vagfpyzhu6jrf7qre

UDSMProt: Universal Deep Sequence Models for Protein Classification

2020 Bioinformatics  
Inferring the properties of a protein from its amino acid sequence is one of the key problems in bioinformatics.  ...  We put forward a universal deep sequence model that is pretrained on unlabeled protein sequences from Swiss-Prot and finetuned on protein classification tasks.  ...  UDSMProt was implemented using PyTorch and fast.ai and the CNN baseline models were implemented in Keras.  ... 
doi:10.1093/bioinformatics/btaa003 pmid:31913448 pmcid:PMC7178389 fatcat:lbvihdnq2jca7fkqgac26jieam

FunFam protein families improve residue level molecular function prediction

Linus Scheibenreif, Maria Littmann, Christine Orengo, Burkhard Rost
2019 BMC Bioinformatics  
We analyzed the similarity of binding site annotations in these FunFams and incorporated FunFams into the prediction of protein binding residues.  ...  The CATH database provides a hierarchical classification of protein domain structures including a sub-classification of superfamilies into functional families (FunFams).  ...  Particular thanks to Sayoni Das (UCL) for building FunFams and for help with using this resource. Thanks also to the anonymous reviewers who helped substantially to improve the paper.  ... 
doi:10.1186/s12859-019-2988-x fatcat:wsct2t3sszczhhbtalde26bio4

EnzDP: Improved enzyme annotation for metabolic network reconstruction based on domain composition profiles

Nam-Ninh Nguyen, Sriganesh Srihari, Hon Wai Leong, Ket-Fah Chong
2015 Journal of Bioinformatics and Computational Biology  
The DEAS score is used to calculate the similarity between proteins, which is then used in clustering procedure, instead of using sequence similarity score.  ...  We improve the enzyme annotation protocol using a stringent classification procedure, and by choosing optimal threshold settings and checking for active sites.  ...  This discrepancy comes from the fact that PRIAM and EFICAz use sequence similarity for clustering, while EnzDP uses domain architecture for the similarity metric.  ... 
doi:10.1142/s0219720015430039 pmid:26542446 fatcat:n6h2douhpvg5xomwvv6ebdgvou

GrAPFI: predicting enzymatic function of proteins from domain similarity graphs

Bishnu Sarker, David W. Ritchie, Sabeur Aridhi
2020 BMC Bioinformatics  
An amendment to this paper has been published and can be accessed via the original article.  ...  Acknowledgements This work was partially supported by the CNRS-INRIA/FAPs project "TempoGraphs" (PRC2243). This work is dedicated to the memory of David W. Ritchie, who recently passed away.  ...  To evaluate the method, we have used a well-defined dataset of enzyme and non-enzyme proteins curated from UniprotKB [1].  ... 
doi:10.1186/s12859-020-3460-7 pmid:32349654 fatcat:2e7ylth6brfk3hycj272dzuomq

Distinguishing Enzyme Structures from Non-enzymes Without Alignments

Paul D. Dobson, Andrew J. Doig
2003 Journal of Molecular Biology  
The most useful features for distinguishing enzymes from non-enzymes are secondary-structure content, amino acid frequencies, number of disulphide bonds and size of the largest cleft.  ...  Validation of the method shows that the function can be predicted to an accuracy of 77% using 52 features to describe each protein.  ...  We thank Ben Stapley for helpful discussions and Kristoffer Rapacki of the Center for Biological Sequence Analysis, Technical University of Denmark for assistance with the ProtFun results.  ... 
doi:10.1016/s0022-2836(03)00628-4 pmid:12850146 fatcat:tvehprwyw5ccfivolhbvdaoro4

ECPred: a tool for the prediction of the enzymatic functions of protein sequences based on the EC nomenclature

Alperen Dalkiran, Ahmet Sureyya Rifaioglu, Maria Jesus Martin, Rengul Cetin-Atalay, Volkan Atalay, Tunca Doğan
2018 BMC Bioinformatics  
Enzyme vs. non-enzyme classification is incorporated into ECPred along with a hierarchical prediction approach exploiting the tree structure of the EC nomenclature.  ...  The automated prediction of the enzymatic functions of uncharacterized proteins is a crucial topic in bioinformatics.  ...  Competing interests The authors declare that they have no competing interests.  ... 
doi:10.1186/s12859-018-2368-y pmid:30241466 pmcid:PMC6150975 fatcat:vslvfrev35gmhhxm2jxxldjcdq

Prediction of protein function from protein sequence and structure

James C. Whisstock, Arthur M. Lesk
2003 Quarterly Reviews of Biophysics (print)  
Nevertheless, prediction of protein function from sequence and structure is a difficult problem, because homologous proteins often have different functions.  ...  Many methods of function prediction rely on identifying similarity in sequence and/or structure between a protein of unknown function and one or more well-understood proteins.  ...  Function prediction from sequence similarity can take advantage of multiple sources of information to back up the prediction from levels of sequence identity alone, and to improve the results in cases  ... 
doi:10.1017/s0033583503003901 pmid:15029827 fatcat:e3x65am2tvbkdfhcv6d5v5i3je

Improving classification in protein structure databases using text mining

Antonis Koussounadis, Oliver C Redfern, David T Jones
2009 BMC Bioinformatics  
The classification of protein domains in the CATH resource is primarily based on structural comparisons, sequence similarity and manual analysis.  ...  Finally when only the highest scoring predictions were used to infer classification, an extra 4.2% of correct decisions were made by the combined classifier.  ...  Acknowledgements AK was funded by BBSRC grant BBC5072531 and the BioSapiens Network of Excellence (funded by the European Commission within its FP6 programme, under the thematic area Life Sciences, Genomics  ... 
doi:10.1186/1471-2105-10-129 pmid:19416501 pmcid:PMC2688513 fatcat:7ai2e6ep5na33mqfrfkk5snezi