On the Approximability of Numerical Taxonomy (Fitting Distances by Tree Metrics)

Richa Agarwala, Vineet Bafna, Martin Farach, Mike Paterson, Mikkel Thorup
1998 SIAM journal on computing (Print)  
We consider the problem of fitting an n × n distance matrix D by a tree metric T. Let ε be the distance to the closest tree metric under the L∞ norm; that is, ε = min_T {‖T − D‖∞}.  ...  First we present an O(n²) algorithm for finding a tree metric T such that ‖T − D‖∞ ≤ 3ε. Second we show that it is NP-hard to find a tree metric T such that ‖T − D‖∞ < (9/8)ε.  ...  One of the most common methods for clustering numeric data involves fitting the data to a tree metric, which is defined by a weighted tree spanning the points of the metric, the distance between two points  ... 
doi:10.1137/s0097539795296334 fatcat:xcdfmrgia5blxkgczaxt4cxyam

Fitting Distances by Tree Metrics Minimizing the Total Error within a Constant Factor [article]

Vincent Cohen-Addad, Debarati Das, Evangelos Kipouridis, Nikos Parotsidis, Mikkel Thorup
2021 arXiv   pre-print
We consider the numerical taxonomy problem of fitting a positive distance function D: (S choose 2) → ℝ_{>0} by a tree metric.  ...  We can do this both for general trees, and for the special case of ultrametrics with a root having the same distance to all vertices in S.  ...  The best previous approximation factor was O((log n)(log log n)) by who wrote "Determining whether an O(1) approximation can be obtained is a fascinating question".  ... 
arXiv:2110.02807v1 fatcat:sxz5iov6xzaotmimokwvdjon6i

Approximating Additive Distortion of Embeddings into Line Metrics [chapter]

Kedar Dhamdhere
2004 Lecture Notes in Computer Science  
We consider the problem of fitting metric data on n points to a path (line) metric. Our objective is to minimize the total additive distortion of this mapping.  ...  The total additive distortion is the sum of errors in all pairwise distances in the input data. This problem has been shown to be NP-hard by [13].  ...  Introduction. One of the most common methods for clustering numerical data is to fit the data to tree metrics. A tree metric is defined on vertices of a weighted tree.  ... 
doi:10.1007/978-3-540-27821-4_9 fatcat:3b7ofxe36neobe5kabuqxq3oru

Page 2049 of Mathematical Reviews Vol. , Issue 97C [page]

1997 Mathematical Reviews  
the approximability of numerical taxonomy (fitting distances by tree metrics).  ...  Summary: “We consider the problem of fitting an n × n distance matrix D by a tree metric T. Let ε be the distance to the closest tree metric under the L∞ norm, that is, ε = min_T {‖T − D‖∞}.  ... 

Improving the Reliability of Decision-Support Systems for Nuclear Emergency Management by Leveraging Software Design Diversity

Tudor B. Ionescu, Walter Scheuermann
2016 Journal of Computing and Information Technology  
The acceptance test and the voter are used in a new scheme, which extends the Consensus Recovery Block method by a database of result taxonomies to support machine-learning.  ...  and the trustworthiness of the simulation results used by emergency managers in the decision making process.  ...  Figure 3. Taxonomic trees corresponding to the three dispersion codes implemented in RODOS. The trees were constructed by means of the GRSS distance with p = 0.6.  ... 
doi:10.20532/cit.2016.1002700 fatcat:6ysnaecvljfrzpzpfolnmhnrfe

Page 7336 of Mathematical Reviews Vol. , Issue 2000j [page]

2000 Mathematical Reviews  
[Farach-Colton, Martin] (1-RTG-C; Piscataway, NJ); Paterson, Mike (4-WARW-C; Coventry); Thorup, Mikkel (DK-CPNH-CS; Copenhagen) On the approximability of numerical taxonomy (fitting distances by tree  ...  Summary: “We consider the problem of fitting an n × n distance matrix D by a tree metric T. Let ε be the distance to the closest tree metric under the L∞ norm; that is, ε = min_T {‖T − D‖∞}.  ... 

l∞-Approximation via Subdominants

Victor Chepoi, Bernard Fichet
2000 Journal of Mathematical Psychology  
algorithm for the problem of fitting a distance by a tree metric).  ...  This leads to simple optimal algorithms for the problem of best l∞-fitting of distances by ultrametrics and by tree metrics preserving the distances to a fixed vertex (the latter provides a 3-approximation  ...  In numerical taxonomy, u is a distance (more generally, a dissimilarity) on a finite set X and K is the cone of all ultrametrics or tree metrics defined on X; see Barthélemy and Guénoche (1991) and  ... 
doi:10.1006/jmps.1999.1270 pmid:11133300 fatcat:mdlbjucod5c5no4w2yfpyistju

Alignment-Free Genome Tree Inference by Learning Group-Specific Distance Metrics

Kaustubh R. Patil, Alice C. McHardy
2013 Genome Biology and Evolution  
We propose a method to improve genome tree inference by learning specific distance metrics over the genome signature for groups of organisms with similar phylogenetic, genomic, or ecological properties  ...  By applying this method to more than a thousand prokaryotic genomes, we showed that, indeed, better distance metrics could be learned for most of the 18 groups of organisms tested here.  ...  Acknowledgments The authors thank Lars Steinbrück (MPI Informatik, HHU) for critical reading of the manuscript and useful comments.  ... 
doi:10.1093/gbe/evt105 pmid:23843191 pmcid:PMC3762195 fatcat:5ow74suvefbxvgthiyzna3rpji

Fitting Tree Metrics: Hierarchical Clustering and Phylogeny

Nir Ailon, Moses Charikar
2011 SIAM journal on computing (Print)  
Partially supported by a Charlotte Elizabeth Procter Fellowship.  ...  The problem of fitting tree metrics also arises in phylogeny where the objective is to learn the evolution tree by fitting a tree to dissimilarity data on taxa.  ...  The quality of the fit is measured by taking the ℓ_p norm of the difference between the tree metric constructed and the given data.  ... 
doi:10.1137/100806886 fatcat:6w6nnelhene3pedbpl4zx7c2wa

Seeded Hierarchical Clustering for Expert-Crafted Taxonomies [article]

Anish Saha, Amith Ananthram, Emily Allaway, Heng Ji, Kathleen McKeown
2022 arXiv   pre-print
In this work, we study Seeded Hierarchical Clustering (SHC): the task of automatically fitting unlabeled data to such taxonomies using only a small set of labeled examples.  ...  It outperforms both unsupervised and supervised baselines for the SHC task on three real-world datasets.  ...  The magnitude of c^(i)_other (i.e., ‖c^(i)_other‖) is approximated to be the magnitude of the centroid of the subtopics. Its representation is given by Equation 2 (see Appendix A for derivation).  ... 
arXiv:2205.11602v1 fatcat:jmdwans4jnejlp5rxpwlyd3dbe

Page 6223 of Mathematical Reviews Vol. , Issue 96j [page]

1996 Mathematical Reviews  
Farach, Babu Narayanan, Mike Paterson and Mikkel Thorup, On the approximability of numerical taxonomy (fitting distances by tree metrics) (365-372); Paolo Ferragina and Roberto Grossi, Fast string searching  ...  Proceedings of the conference held at the University of Leeds, Leeds, September 1993. Edited by D. M. Titterington. The Institute of Mathematics and its Applications Conference Series. New Series, 54.  ... 

Many-to-Many Feature Matching Using Spherical Coding of Directed Graphs [chapter]

M. Fatih Demirci, Ali Shokoufandeh, Sven Dickinson, Yakov Keselman, Lars Bretzner
2004 Lecture Notes in Computer Science  
The algorithm was based on a metric-tree representation of labeled graphs and their metric embedding into normed vector spaces, using the embedding algorithm of Matousek [13].  ...  This reduces the problem of directed graph matching to the problem of geometric point matching, for which efficient many-to-many matching algorithms exist, such as the Earth Mover's Distance.  ...  The work of Yakov Keselman is supported, in part, by the NSF grant No. 0125068. Sven Dickinson acknowledges the support of NSERC, CITO, IRIS, PREA, and the NSF.  ... 
doi:10.1007/978-3-540-24670-1_25 fatcat:burtx4motjfefl2bnkucsgw4pu

Tumor classification using phylogenetic methods on expression data

Richard Desper, Javed Khan, Alejandro A. Schäffer
2004 Journal of Theoretical Biology  
To solve the class discovery problem, we impose a metric on a set of tumors as a function of their gene expression levels, and impose a tree structure on this metric, using standard tree fitting methods  ...  To solve the class prediction problem, we built a classification tree on the learning set, and then sought the optimal placement of each test sample within the classification tree.  ...  Also, we thank three anonymous referees for many helpful suggestions that led to substantial improvements in the manuscript.  ... 
doi:10.1016/j.jtbi.2004.02.021 pmid:15178197 fatcat:7urg6jw34bfopkopuu2bwgybtq

Statistical Object Data Analysis of Taxonomic Trees from Human Microbiome Data

Patricio S. La Rosa, Berkley Shands, Elena Deych, Yanjiao Zhou, Erica Sodergren, George Weinstock, William D. Shannon, Chuhsing Kate Hsiao
2012 PLoS ONE  
The contribution of our work is threefold: first, a weighted tree structure to analyze RDP data is introduced; second, using a probability measure to model a set of taxonomic trees, we introduce an approximate  ...  The data objects that pertain to this work are taxonomic trees of bacteria built from analysis of 16S rRNA gene sequences (e.g. using RDP); there is one such object for each biological sample analyzed.  ...  R z an arbitrary metric of distance on G.  ... 
doi:10.1371/journal.pone.0048996 pmid:23152838 pmcid:PMC3494672 fatcat:r7ccpwelrvgzvlqclloecifcri

Automatic Taxonomy Construction from Keywords via Scalable Bayesian Rose Trees

Yangqiu Song, Shixia Liu, Xueqing Liu, Haixun Wang
2015 IEEE Transactions on Knowledge and Data Engineering  
, the domain of interest is already represented by a set of keywords.  ...  We reduce the complexity of previous hierarchical clustering approaches from O(n 2 log n) to O(n log n) using a nearest-neighbor-based approximation, so that we can derive a domain-specific taxonomy from  ...  ACKNOWLEDGEMENTS We would like to thank Charles Blundell and Yee Whye Teh for their help on the implementation of the Bayesian rose tree and thanks to Ting Liu for help on the implementation of Spilltree  ... 
doi:10.1109/tkde.2015.2397432 fatcat:qcxwfsed3vgwroyklu6h5ip2ma