Balancing clusters to reduce response time variability in large scale image search
[article]
2010
arXiv
pre-print
Many algorithms for approximate nearest neighbor search in high-dimensional spaces partition the data into clusters. At query time, to avoid an exhaustive search, an index selects the few clusters (or the single cluster) nearest to the query point. Clusters are often produced by the well-known k-means approach, since it has several desirable properties. On the downside, k-means tends to produce clusters with quite different cardinalities. Imbalanced clusters negatively impact both the variance and the
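The cluster-based search described in the abstract can be illustrated with a minimal sketch. It assumes toy 2-D data and plain Lloyd's k-means, and all names (`kmeans`, `build_index`, `search`, `n_probe`) are hypothetical. It shows only the baseline setup, that is, partitioning, probing the nearest clusters, and inspecting cluster sizes; it is not the balancing method proposed in the paper.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroid(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (illustrative, unoptimized); returns the centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        assign = [nearest_centroid(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old centroid if a cluster goes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def build_index(points, centroids):
    """Inverted lists: cluster id -> ids of the points assigned to it."""
    lists = [[] for _ in centroids]
    for i, p in enumerate(points):
        lists[nearest_centroid(p, centroids)].append(i)
    return lists


def search(query, points, centroids, lists, n_probe=1):
    """Scan only the n_probe clusters whose centroids are nearest to the query."""
    order = sorted(range(len(centroids)), key=lambda j: sq_dist(query, centroids[j]))
    candidates = [i for c in order[:n_probe] for i in lists[c]]
    best = min(candidates, key=lambda i: sq_dist(query, points[i]), default=None)
    return best, len(candidates)


# Toy data (an assumption for illustration): one dense blob plus sparse background.
rng = random.Random(1)
points = [(rng.gauss(0, 0.2), rng.gauss(0, 0.2)) for _ in range(900)]
points += [(rng.uniform(-5, 5), rng.uniform(-5, 5)) for _ in range(100)]

centroids = kmeans(points, k=8)
lists = build_index(points, centroids)
print("cluster sizes:", sorted(len(l) for l in lists))

best, scanned = search((0.1, -0.1), points, centroids, lists, n_probe=1)
print("scanned", scanned, "of", len(points), "points")
```

The number of points scanned for a query equals the total size of the probed clusters, so unequal cluster sizes translate directly into query costs that differ from one query to the next.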
arXiv:1009.4739v1
fatcat:hdod6pwlgbbwnpwhfvhiel45vm