Smaller, Faster & Lighter KNN Graph Constructions

Rachid Guerraoui, Anne-Marie Kermarrec, Olivier Ruas, François Taïani
2020 Proceedings of The Web Conference 2020  
We propose GoldFinger, a new compact and fast-to-compute binary representation of datasets to approximate Jaccard's index. We illustrate the effectiveness of GoldFinger on the emblematic big data problem of K-Nearest-Neighbor (KNN) graph construction and show that GoldFinger can drastically accelerate a large range of existing KNN algorithms with little to no overhead. As a side effect, we also show that the compact representation of the data protects users' privacy for free by providing k-anonymity and l-diversity. Our extensive evaluation of the resulting approach on several realistic datasets shows that our approach delivers speedups of up to 78.9% compared to the use of raw data while only incurring a negligible to moderate loss in terms of KNN quality. To convey the practical value of such a scheme, we apply it to item recommendation and show that the loss in recommendation quality is negligible.
doi:10.1145/3366423.3380184 dblp:conf/www/GuerraouiKRT20 fatcat:spljzu2qybfbdcqmlnwv4kphba
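
The abstract does not spell out how the binary representation is built, so the following is only a minimal sketch of one plausible way to approximate Jaccard's index from fixed-size bit vectors. It is not the authors' implementation. The function names (`fingerprint`, `estimated_jaccard`, `exact_jaccard`), the BLAKE2b hash, and the default width `n_bits=1024` are all assumptions made for illustration. Each item sets one bit chosen by hashing, and similarity is estimated as the ratio of bits set in the AND of two vectors to bits set in their OR.

```python
import hashlib


def fingerprint(items, n_bits=1024):
    """Map a collection of items to an n_bits-wide bit vector (a Python int).

    Assumption: one hash per item, one bit per hash. Distinct items may
    collide on the same bit, which is what makes the result approximate.
    """
    bits = 0
    for item in items:
        digest = hashlib.blake2b(str(item).encode("utf-8"), digest_size=8).digest()
        position = int.from_bytes(digest, "big") % n_bits
        bits |= 1 << position
    return bits


def estimated_jaccard(fp_a, fp_b):
    """Estimate Jaccard's index from two bit vectors via bit counts."""
    union_bits = bin(fp_a | fp_b).count("1")
    if union_bits == 0:
        return 0.0  # both profiles are empty
    return bin(fp_a & fp_b).count("1") / union_bits


def exact_jaccard(a, b):
    """Exact Jaccard's index on the raw item sets, for comparison."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    user_a = range(0, 30)   # hypothetical item ids
    user_b = range(15, 45)
    print("exact    :", exact_jaccard(user_a, user_b))
    print("estimated:", estimated_jaccard(fingerprint(user_a), fingerprint(user_b)))
```

In this sketch, comparing two profiles costs two bitwise operations and two bit counts regardless of profile size, which hints at why such a representation could speed up the many similarity computations a KNN graph construction performs. A smaller `n_bits` gives a more compact vector at the cost of more collisions and a coarser estimate.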