A Structured Family of Clustering and Tree Construction Methods

David Bryant, Vincent Berry
2001 Advances in Applied Mathematics  
We therefore explore extensions of Apresjan clustering to a family of related hierarchical clustering methods.  ...  The extensions are shown to be closely connected with the well-known single and average linkage tree constructions. A dual family of methods for classification by splits is also presented.  ...  ACKNOWLEDGMENTS We thank Olivier Gascuel, Vincent Moulton, and Mike Steel for reading through versions of this manuscript.  ... 
doi:10.1006/aama.2001.0758 fatcat:t45stcnfr5fvpmx6ujhxyhti7q

PhyloFacts: an online structural phylogenomic encyclopedia for protein functional and structural classification

Nandini Krishnamurthy, Duncan P Brown, Dan Kirshner, Kimmen Sjölander
2006 Genome Biology  
The Berkeley Phylogenomics Group presents PhyloFacts, a structural phylogenomic encyclopedia containing almost 10,000 'books' for protein families and domains, with pre-calculated structural, functional  ...  PhyloFacts enables biologists to avoid the systematic errors associated with function prediction by homology through the integration of a variety of experimental data and bioinformatics methods in an evolutionary  ...  Acknowledgements This work was supported by a Presidential Early Career Award for Scientists and Engineers (PECASE) from the National Science Foundation, and by an R01 from the National Human Genome Research  ... 
doi:10.1186/gb-2006-7-9-r83 pmid:16973001 pmcid:PMC1794543 fatcat:uib3z34bjzfcneno7favzrp6ri

DNA Familial Binding Profiles Made Easy: Comparison of Various Motif Alignment and Clustering Strategies

Shaun Mahony, Philip E. Auron, Panayiotis V. Benos
2007 PLoS Computational Biology  
A new method for automatic determination of the optimal number of clusters is developed and applied in the construction of a new set of familial binding profiles which improves upon TF classification accuracy  ...  In addition, multiple-alignment strategies for binding profiles and tree-building methods are tested for their efficiency in constructing generalized binding models.  ...  PVB was supported by US National Institutes of Health grants 1R01LM007994-01 and RR014214 and by TATRC/DoD USAMRAA Prime Award W81XWH-05-2-0066. PEA was supported by NIH grant CA06668544.  ... 
doi:10.1371/journal.pcbi.0030061 pmid:17397256 pmcid:PMC1848003 fatcat:4kkz2ufbeneifknrgofcqlapcm

DNA Familial Binding Profiles Made Easy: Comparison of Various Motif Alignment and Clustering Strategies

Shaun Mahony, Philip Auron, Panayiotis (Takis) V. Benos
2005 PLoS Computational Biology  
A new method for automatic determination of the optimal number of clusters is developed and applied in the construction of a new set of familial binding profiles which improves upon TF classification accuracy  ...  In addition, multiple-alignment strategies for binding profiles and tree-building methods are tested for their efficiency in constructing generalized binding models.  ...  PVB was supported by US National Institutes of Health grants 1R01LM007994-01 and RR014214 and by TATRC/DoD USAMRAA Prime Award W81XWH-05-2-0066. PEA was supported by NIH grant CA06668544.  ... 
doi:10.1371/journal.pcbi.0030061.eor fatcat:pzm47xapzzfqdi6msjwlrrcg2q

Product Family Formation for Reconfigurable Assembly Systems

Mohamed Kashkoush, Hoda ElMaraghy
2014 Procedia CIRP  
A novel consensus tree-based method is applied to find the best aggregation for the three different hierarchical clustering trees. The proposed method is applied to an example of eight products.  ...  Appropriate formation of product families for Reconfigurable Manufacturing Systems (RMS) is of a great importance for a cost-effective and productive manufacturing.  ...  Kashkoush and ELMaraghy [9] have recently developed a GA-based method for constructing the consensus tree for any given set of assembly sequence trees and used it in a retrieval-based method for assembly  ... 
doi:10.1016/j.procir.2014.01.131 fatcat:yz3xhdiy4veqjfuxr65quxxmsm

Deep forest ensemble learning for classification of alignments of non-coding RNA sequences based on multi-view structure representations

Ying Li, Qi Zhang, Zhaoqian Liu, Cankun Wang, Siyu Han, Qin Ma, Wei Du
2020 Briefings in Bioinformatics  
Additionally, we apply GCFM to construct a phylogenetic tree of ncRNA and predict the probability of interactions between RNAs.  ...  Furthermore, the clustering of ncRNA families is carried out based on the classification matrix generated from GCFM.  ...  Funding This work was supported by the National Natural Science Foundation of China (61872418, 61972175 and 71774154) and Natural Science Foundation of Jilin Province (20180101331JC and 20180101050JC).  ... 
doi:10.1093/bib/bbaa354 pmid:33367506 pmcid:PMC8294561 fatcat:jnpgxrslirb2dcjfjhjzc4v6sq

Phylogenomic inference of protein molecular function: advances and challenges

K. Sjolander
2004 Bioinformatics  
Motivation: Protein families evolve a multiplicity of functions through gene duplication, speciation and other processes.  ...  Phylogenomic analysis (combining phylogenetic tree construction, integration of experimental data and differentiation of orthologs and paralogs) has been proposed to address these errors and improve the  ...  This work was supported in part by Grant no. 0238311 from the National Science Foundation, and by Grant no. R01 HG002769-01 from the National Institutes of Health.  ... 
doi:10.1093/bioinformatics/bth021 pmid:14734307 fatcat:juaxrnw2mngbfbflxh6kkvvwqu

Safe Functional Inference for Uncharacterized Viral Proteins

Michal Linial, Yaniv Loewenstein
2008 Nature Precedings  
Superfamily tree search illustration. Pink and blue represent proteins in homologous families A and B, while green and black denote other families C and D.  ...  A and C coincide on a multi-domain protein (pink and green protein) which may induce false transitivity: falsely clustering A with nonhomologous C due to local BLAST similarities of multi-domain protein  ...  The tree construction is fully automatic, and is based only on reported BLAST similarities among clustered sequences.  ... 
doi:10.1038/npre.2008.2187 fatcat:og5bi56rebdb5konqg24sy4bwm

Safe Functional Inference for Uncharacterized Viral Proteins

Michal Linial, Yaniv Loewenstein
2008 Nature Precedings  
Superfamily tree search illustration. Pink and blue represent proteins in homologous families A and B, while green and black denote other families C and D.  ...  A and C coincide on a multi-domain protein (pink and green protein) which may induce false transitivity: falsely clustering A with nonhomologous C due to local BLAST similarities of multi-domain protein  ...  The tree construction is fully automatic, and is based only on reported BLAST similarities among clustered sequences.  ... 
doi:10.1038/npre.2008.2187.1 fatcat:52l66tq47zhgjlgta5wpxgyjvq

Topological Analysis of Syntactic Structures [article]

Alexander Port, Taelin Karidi, Matilde Marcolli
2019 arXiv   pre-print
We analyze relations between syntactic parameters in terms of dimensionality, of hierarchical clustering structures, and of non-trivial loops.  ...  We use the persistent homology method of topological data analysis and dimensional analysis techniques to study data of syntactic structures of world languages.  ...  causes a lot of incorrect placements of languages both within and across subfamilies, as already observed with other tree construction methods in [39] .  ... 
arXiv:1903.05181v1 fatcat:i2rqz3kzcjed3kh7ygd6e3t25u

Deep hierarchical embedding for simultaneous modeling of GPCR proteins in a unified metric space

Taeheon Lee, Sangseon Lee, Minji Kang, Sun Kim
2021 Scientific Reports  
However, modeling of GPCR families has been performed separately for each of the family, subfamily, and sub-subfamily level.  ...  In this study, we propose DeepHier, a deep learning model to simultaneously learn representations of GPCR family hierarchy from the protein sequences with a unified single model.  ...  tree 44 .  ... 
doi:10.1038/s41598-021-88623-8 pmid:33953216 fatcat:cxewfy4ofje5nermpfr6ndmvn4

Clustering Rfam 10.1: Clans, Families, and Classes

Felipe A. Lessa, Tainá Raiol, Marcelo M. Brigido, Daniele S. B. Martins Neto, Maria Emília M. T. Walter, Peter F. Stadler
2012 Genes  
In conclusion, a structure-based clustering can contribute to the elucidation of the relationships among the Rfam families beyond the realm of clans and classes.  ...  In the present work we investigate an alternative classification for the RNA families based on tree edit distance. The resulting clustering recovers some of the Rfam clans.  ...  Acknowledgments This work was supported in part by CAPES (to F.A.L. and T.R.), by the Deutsche Forschungsgemeinschaft (grant STA 850/7-1 within SPP-1258 to P.F.S.), and by CNPq and FINEP 01.08.0166.00  ... 
doi:10.3390/genes3030378 pmid:24704975 pmcid:PMC3899987 fatcat:feuw3rm5unhwtojkpzg7qnlsha

Berkeley Phylogenomics Group web servers: resources for structural phylogenomic analysis

J. G. Glanville, D. Kirshner, N. Krishnamurthy, K. Sjolander
2007 Nucleic Acids Research  
, FlowerPower clustering of proteins sharing the same domain architecture, MUSCLE multiple sequence alignment, SATCHMO simultaneous alignment and tree construction and SCI-PHY subfamily identification.  ...  The Berkeley Phylogenomics Group provides a series of web servers for phylogenomic analysis: classification of sequences to pre-computed families and subfamilies using the PhyloFacts Phylogenomic Encyclopedia  ...  ACKNOWLEDGEMENTS The authors wish to thank the numerous developers of bioinformatics web servers and databases providing methods or data included in the PhyloFacts resource and web servers.  ... 
doi:10.1093/nar/gkm325 pmid:17488835 pmcid:PMC1933202 fatcat:omacwxa4vvb5toaxahyhrolq3u

Getting Started in Structural Phylogenomics

Kimmen Sjölander, Olga Troyanskaya
2010 PLoS Computational Biology  
Acknowledgments We thank anonymous referees and Tandy Warnow for helpful comments.  ...  The SATCHMO (simultaneous alignment and tree construction using hidden Markov models) method addresses this issue using agglomerative clustering and profile-profile alignment to estimate a tree topology  ...  of a family of related sequences (i.e., a gene tree or multi-gene tree including gene duplication events) [2] [3] [4] .  ... 
doi:10.1371/journal.pcbi.1000621 pmid:20126522 pmcid:PMC2813252 fatcat:rjcpamo7ebbatkal3yrophvoeu

FlowerPower: clustering proteins into domain architecture classes for phylogenomic inference of protein function

Nandini Krishnamurthy, Duncan Brown, Kimmen Sjölander
2007 BMC Evolutionary Biology  
Results: We present FlowerPower, a novel clustering algorithm designed for the identification of global homologs as a precursor to structural phylogenomic analysis.  ...  Phylogenomic analysis reduces these errors by inferring protein function within the evolutionary context of the entire family.  ...  ), no undefined regions of >80 amino acids (i.e., a region with no PFAM match), and each PFAM domain was required to match a 3D structure classified by the SCOP database.  ... 
doi:10.1186/1471-2148-7-s1-s12 pmid:17288570 pmcid:PMC1796606 fatcat:eka635udezg5ji7lg4m25eyi4m