Angle Tree: Nearest Neighbor Search in High Dimensions with Low Intrinsic Dimensionality [article]

Ilia Zvedeniouk, Sanjay Chawla
2010 arXiv   pre-print
We propose an extension of tree-based space-partitioning indexing structures for data with low intrinsic dimensionality embedded in a high dimensional space. We call this extension an Angle Tree. Our extension can be applied to both classical kd-trees as well as the more recent rp-trees. The key idea of our approach is to store the angle (the "dihedral angle") between the data region (which is a low dimensional manifold) and the random hyperplane that splits the region (the "splitter"). We show
more » ... that the dihedral angle can be used to obtain a tight lower bound on the distance between the query point and any point on the opposite side of the splitter. This in turn can be used to efficiently prune the search space. We introduce a novel randomized strategy to efficiently calculate the dihedral angle with a high degree of accuracy. Experiments and analysis on real and synthetic data sets shows that the Angle Tree is the most efficient known indexing structure for nearest neighbor queries in terms of preprocessing and space usage while achieving high accuracy and fast search time.
arXiv:1003.5474v2 fatcat:iwmrw72ayndadk6ke3c2vpjlxi