Efficient use of unlabeled data for protein sequence classification: a comparative study

Pavel Kuksa, Pai-Hsi Huang, Vladimir Pavlovic
2009 BMC Bioinformatics  
Results: Combined with state-of-the-art string kernels, our proposed computational framework achieves very accurate semi-supervised protein remote fold and homology detection on three large unlabeled databases  ...  For example, predictive models based on string kernels trained on sequences known to belong to particular folds or superfamilies, the so-called labeled data set, can attain significantly improved accuracy  ...  Acknowledgements This article has been published as part of BMC Bioinformatics Volume 10 Supplement 4, 2009: Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine (BIBM) 2008  ... 
doi:10.1186/1471-2105-10-s4-s2 pmid:19426450 pmcid:PMC2681072 fatcat:xmmmxzyyd5f65ee36nh3s257qm

On the Role of Local Matching for Efficient Semi-supervised Protein Sequence Classification

Pavel Kuksa, Pai-Hsi Huang, Vladimir Pavlovic
2008 2008 IEEE International Conference on Bioinformatics and Biomedicine  
As overly-represented sequences in large uncurated databases may bias kernel estimations that rely on unlabeled data, we also propose a method to remove this bias and improve performance of resulting classifiers  ...  Combined with a computationally efficient sparse family of string kernels, our proposed framework achieves state-of-the-art accuracy in semi-supervised protein remote homology detection on three large unlabeled  ...  Introduction In this work we address the problem of predicting protein remote homology using only the primary sequence.  ... 
doi:10.1109/bibm.2008.52 dblp:conf/bibm/KuksaHP08 fatcat:6nijxiglfrez7kstgjagvpw4fa

Automatic prediction of protein function

K. O. Wrzeszczynski, Y. Ofran, B. Rost, R. Nair, J. Liu
2003 Cellular and Molecular Life Sciences (CMLS)  
An outstanding new method predicts classes of cellular function directly from sequence.  ...  Most methods annotating protein function utilise sequence homology to proteins of experimentally known function. Such a homology-based annotation transfer is problematic and limited in scope.  ...  Functional classes can be predicted from sequence An interesting hybrid system uses inductive logic programming to predict functional classes with and without homology to experimentally annotated proteins  ... 
doi:10.1007/s00018-003-3114-8 pmid:14685688 fatcat:wc72qcih5bgz7hyezny77p7bou

Computational prediction of protein interfaces: A review of data driven methods

Li C. Xue, Drena Dobbs, Alexandre M.J.J. Bonvin, Vasant Honavar
2015 FEBS Letters  
Keywords: Protein-protein interaction; Machine learning; Docking; Partner-specific interface prediction; Cross validation on protein level; Cross validation on instance level; Evaluation caveats  ...  Here, we review the basic concepts, principles and recent advances in computational approaches to the analysis and prediction of protein-protein interfaces.  ...  When the protein structure is not available (which is the case for most proteins), one has to rely on bioinformatics methods to predict solvent accessibilities. Surface shape.  ... 
doi:10.1016/j.febslet.2015.10.003 pmid:26460190 pmcid:PMC4655202 fatcat:7vv625xe4bg3xji6wdig2coue4

Predicting flexible length linear B-cell epitopes

Yasser El-Manzalawy, Drena Dobbs, Vasant Honavar
2008 Computational systems bioinformatics. Computational Systems Bioinformatics Conference  
Based on our empirical comparisons, we propose FBCPred, a novel method for predicting flexible length linear B-cell epitopes using the subsequence kernel.  ...  Therefore, computational tools for reliably predicting B-cell epitopes are highly desirable. We explore two machine learning approaches for predicting flexible length linear B-cell epitopes.  ...  Previous methods for predicting linear B-cell epitopes (e.g., 15, 17, 19, 18, 20) have been evaluated on datasets of unique epitopes without applying any homology reduction procedure as a pre-processing  ... 
pmid:19642274 pmcid:PMC3400678 fatcat:nb6jdbej5zbvlpy2dvtwzmxaxi

Uncoiling CNLs: Structure/function approaches to understanding CC domain function in plant NLRs

Adam R Bentham, Rafał Zdrzałek, Juan Carlos De la Concepcion, Mark J Banfield
2018 Plant and Cell Physiology  
Finally, we discuss whether using homology modeling is useful to describe putative CC domain function in CNLs through parallels with the functions of previously characterized helical adaptor proteins.  ...  The two major plant NLR classes are defined by the presence of either a Toll/interleukin-1 receptor (TIR) or a coiled-coil (CC) domain at their N-terminus (TNLs and CNLs).  ...  One such technique is SEC, but this is a purely in vitro assay that relies on heterologous expression and purification of the protein(s) of interest, most commonly Escherichia coli.  ... 
doi:10.1093/pcp/pcy185 pmid:30192967 fatcat:2gvarcjxozc7ndpndwyf3fimpy

Predicting linear B‐cell epitopes using string kernels

Yasser EL‐Manzalawy, Drena Dobbs, Vasant Honavar
2008 Journal of Molecular Recognition  
We evaluated Support Vector Machine (SVM) classifiers trained utilizing five different kernel methods using fivefold cross-validation on a homology-reduced data set of 701 linear B-cell epitopes, extracted  ...  Based on the results of our computational experiments, we propose BCPred, a novel method for predicting linear B-cell epitopes using the subsequence kernel.  ...  that rely on analysis of amino acid physicochemical properties.  ... 
doi:10.1002/jmr.893 pmid:18496882 pmcid:PMC2683948 fatcat:mv64qg3u7bggpn25v5oslvwgtm

A Deeply Glimpse into Protein Fold Recognition

Marwa Mohammed M. Ghareeb, Ahmed Sharaf Eldin, Taysir Hassan A. Soliman, Mohammed Ebrahim Marie
2013 Zenodo  
Thus, the need of extracting structural information through computational analysis of protein sequences has become very important, especially, the prediction of the fold of a query protein from its primary  ...  This paper puts a spot on this growing field and covers the main approaches and perspectives to handle this problem. Read Complete Article at ijSciences: V2201305187  ...  Thomas W. proposed protein fold class prediction using neural networks with tailored early-stopping [22].  ... 
doi:10.5281/zenodo.3348233 fatcat:sui7cakaaraqdh2a6pifgzrdmm

Identification of host-microbe interaction factors in the genomes of soft rot-associated pathogens Dickeya dadantii 3937 and Pectobacterium carotovorum WPP14 with supervised machine learning

Bing Ma, Amy O Charkowski, Jeremy D Glasner, Nicole T Perna
2014 BMC Genomics  
Computational approaches to identify virulence genes often rely on two strategies: searching for sequence similarity to known host-microbe interaction factors from other organisms, and identifying islands  ...  Our approach achieved greater than 90% precision and a recall rate over 80% in 10-fold cross validation tests.  ...  Results and discussion Many computational methods have been used to identify gene functions involved in host-microbe interaction, and most of them rely primarily on homology-based searches using known  ... 
doi:10.1186/1471-2164-15-508 pmid:24952641 pmcid:PMC4079955 fatcat:jpznt6plnfge5gqwnwydavxii4

Role of the Biomolecular Energy Gap in Protein Design, Structure, and Evolution

Sarel J. Fleishman, David Baker
2012 Cell  
The folding of natural biopolymers into unique three-dimensional structures that determine their function is remarkable considering the vast number of alternative states and requires a large gap in the  ...  This Perspective explores the implications of this energy gap for computing the structures of naturally occurring biopolymers, designing proteins with new structures and functions, and optimally integrating  ...  Most methods for designing protein-protein interfaces have relied on forward design by generating sequences that are predicted to bind tightly to their targets.  ... 
doi:10.1016/j.cell.2012.03.016 pmid:22500796 fatcat:hj2gccfrxrgplc6pmycww42jum

Speeding disease gene discovery by sequence based candidate prioritization

Euan A Adie, Richard R Adams, Kathryn L Evans, David J Porteous, Ben S Pickard
2005 BMC Bioinformatics  
On average, PROSPECTR enriches lists for disease genes two-fold 77% of the time, five-fold 37% of the time and twenty-fold 11% of the time.  ...  It performs markedly better than the single existing sequence-based classifier on novel data.  ...  Acknowledgements The authors wish to thank Colin Semple (MRC Human Genetics Unit, Edinburgh) for discussion and his comments on the manuscript.  ... 
doi:10.1186/1471-2105-6-55 pmid:15766383 pmcid:PMC1274252 fatcat:lf7nrd6uufdf7hkd5n2xhmvznm

PREDICTING FLEXIBLE LENGTH LINEAR B-CELL EPITOPES

Yasser EL-Manzalawy, Drena Dobbs, Vasant Honavar
2008 Computational Systems Bioinformatics  
Based on our empirical comparisons, we propose FBCPred, a novel method for predicting flexible length linear B-cell epitopes using the subsequence kernel.  ...  Therefore, computational tools for reliably predicting B-cell epitopes are highly desirable. We explore two machine learning approaches for predicting flexible length linear B-cell epitopes.  ...  Previous methods for predicting linear B-cell epitopes (e.g., 15, 17, 19, 18, 20) have been evaluated on datasets of unique epitopes without applying any homology reduction procedure as a pre-processing  ... 
doi:10.1142/9781848162648_0011 fatcat:fr7xdueuybh5vjozubkb2r3ewa

Machine Learning for Protein Function [article]

Dan Ofer
2016 arXiv   pre-print
In this thesis, I focused on feature engineering and machine learning methods for identifying diverse classes of proteins that share functional relatedness but little sequence or structural similarity,  ...  I aim to identify functional protein classes solely using unannotated protein primary sequences from any organism.  ...  Without her I would never have entered the field, or encountered the myriad opportunities it led me to. She gave me the opportunity to try and learn, despite my beginning as a Tabula-Rasa.  ... 
arXiv:1603.02021v1 fatcat:ickyv6nsmjgu5cgxzpafrjjzem

Computational Methods for Prediction of Protein-Protein Interaction Sites [chapter]

Jarek Meller, Alexey Porollo
2012 Protein-Protein Interactions - Computational and Experimental Tools  
These cases may be challenging to prediction methods that rely on high resolution data with all atoms resolved.  ...  Sequence homology-based approach assumes that similar protein sequences adopt the same 3D fold and carry the same function, which is not always true.  ... 
doi:10.5772/36716 fatcat:n6lgxahu3jaexma2v42t5bxqea

Distinguishing Enzyme Structures from Non-enzymes Without Alignments

Paul D. Dobson, Andrew J. Doig
2003 Journal of Molecular Biology  
Here, we show that protein function can be predicted as enzymatic or not without resorting to alignments.  ...  Current methods for predicting protein function are mostly reliant on identifying a similar protein of known function.  ...  overly directed towards any particular class.  ... 
doi:10.1016/s0022-2836(03)00628-4 pmid:12850146 fatcat:tvehprwyw5ccfivolhbvdaoro4