Multi-relational Data Mining for Tetratricopeptide Repeats (TPR)-Like Superfamily Members in Leishmania spp.: Acting-by-Connecting Proteins [chapter]

Karen T. Girão, Fátima C. E. Oliveira, Kaio M. Farias, Italo M. C. Maia, Samara C. Silva, Carla R. F. Gadelha, Laura D. G. Carneiro, Ana C. L. Pacheco, Michel T. Kamimura, Michely C. Diniz, Maria C. Silva, Diana M. Oliveira
2008 Lecture Notes in Computer Science  
The multi-relational data mining (MRDM) approach looks for patterns that span multiple tables of a relational database made of complex/structured objects whose normalized representation requires multiple tables. We have applied MRDM methods (relational association rule discovery and probabilistic relational models) together with hidden Markov models (HMMs) and the Viterbi algorithm (VA) to mine tetratricopeptide repeat (TPR), pentatricopeptide repeat (PPR) and half-a-TPR (HAT) motifs in genomes of pathogenic protozoa Leishmania. TPR is a protein-protein interaction module, and TPR-containing proteins (TPRPs) act as scaffolds for the assembly of different multiprotein complexes. Our aim is to build a comprehensive panel of the TPR-like superfamily of Leishmania. Distributed relational state representations for complex stochastic processes were applied to the identification, clustering and classification of Leishmania genes, and we were able to detect 104 putative TPRPs, 36 PPRPs and 8 HATPs, comprising the TPR-like superfamily. We have also compared currently available resources (Pfam, SMART, SUPERFAMILY and TPRpred) with our approach (MRDM/HMM/VA).
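As an illustration of the HMM/Viterbi step named in the abstract, the following is a minimal sketch of Viterbi decoding in log space. It is not the authors' implementation: the two states (`repeat`, `background`), the reduced two-letter alphabet (`h` for hydrophobic, `p` for polar) and all probabilities are hypothetical toy values chosen only to show how the most probable state path is recovered from a sequence.

```python
import math

def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return the most probable state path for obs and its log-probability."""
    # best[t][s]: highest log-probability of any path that ends in state s at position t
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + log_trans[p][s]) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + log_emit[s][obs[t]]
            back[t][s] = prev
    # Trace back from the best final state
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]

def to_log(table):
    """Convert a nested dict of probabilities to log-probabilities."""
    return {k: {j: math.log(v) for j, v in row.items()} for k, row in table.items()}

# Hypothetical toy model (assumed values, not taken from the paper)
states = ["repeat", "background"]
log_start = {s: math.log(0.5) for s in states}
log_trans = to_log({
    "repeat": {"repeat": 0.9, "background": 0.1},
    "background": {"repeat": 0.1, "background": 0.9},
})
log_emit = to_log({
    "repeat": {"h": 0.8, "p": 0.2},      # repeat regions assumed hydrophobic-rich
    "background": {"h": 0.3, "p": 0.7},
})

path, score = viterbi("pppphhhhhhpppp", states, log_start, log_trans, log_emit)
print(path)
```

In a realistic setting the states and emission tables would come from a profile HMM trained on known TPR, PPR or HAT motifs rather than from hand-picked numbers, and the decoded path would mark candidate repeat positions in each protein sequence.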
doi:10.1007/978-3-540-88436-1_31 fatcat:72sozzbi3vajtdkpfsyoznilw4