Specific Peptides Predict Protein Classification [article]

David Horn, Uri Weingart
2022 bioRxiv   pre-print
The methodology of Specific Peptides (SP) was introduced in the context of enzymes. It is based on an unsupervised machine learning (ML) tool for motif extraction, followed by supervised annotation of the motifs. In the case of enzymes, the classifier is the Enzyme Classification (EC) number. Here we demonstrate that this method reaches a precision of 96.5% and a recall of 89.1% on presently available protein sequences. We also apply this method to two other protein families, GPCR and ZF,
... nd their corresponding SPs, and provide the code for searching any protein sequence for its classification under any such family.
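The search step described in the abstract can be illustrated with a minimal sketch. It is not the authors' released code. It assumes that an SP table can be represented as a plain mapping from peptide string to class label, that a hit is an exact substring match, and that a sequence receives the label carried by most of its hits. The table `TOY_SPS`, its peptides and labels, and the function names are all hypothetical. The precision and recall definitions used here (precision over sequences that received a prediction, recall over all labeled sequences) are likewise an assumption, not taken from the preprint.

```python
from collections import Counter

# Hypothetical toy table (peptide -> class label); these are NOT real SPs.
TOY_SPS = {
    "AAAWW": "class_A",
    "HHHLL": "class_A",
    "QQPQQ": "class_B",
}


def find_hits(sequence, sp_table):
    """Return (peptide, label, position) for every exact occurrence of a table peptide."""
    hits = []
    for peptide, label in sp_table.items():
        start = sequence.find(peptide)
        while start != -1:
            hits.append((peptide, label, start))
            # Step by one so overlapping occurrences are also reported.
            start = sequence.find(peptide, start + 1)
    return hits


def classify(sequence, sp_table):
    """Predict the label carried by most hits, or None if no peptide matches."""
    counts = Counter(label for _, label, _ in find_hits(sequence, sp_table))
    return counts.most_common(1)[0][0] if counts else None


def precision_recall(predicted, actual):
    """Assumed definitions: precision over predicted sequences, recall over all sequences."""
    true_pos = sum(1 for p, a in zip(predicted, actual) if p is not None and p == a)
    n_predicted = sum(1 for p in predicted if p is not None)
    precision = true_pos / n_predicted if n_predicted else 0.0
    recall = true_pos / len(actual) if actual else 0.0
    return precision, recall
```

For example, `classify("MKTAAAWWGGHHHLLL", TOY_SPS)` finds two hits that both carry `class_A`, while a sequence containing none of the table's peptides yields `None` and counts against recall but not against precision.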
doi:10.1101/2022.02.04.479085 fatcat:vmgwewbsjngwllriiv42wwppvq