Parallel query processing on distributed clustering indexes

Veronica Gil-Costa, Mauricio Marin, Nora Reyes
2009 Journal of Discrete Algorithms  
Similarity search has proven suitable for searching large collections of unstructured data objects. A number of practical index data structures have been proposed for this purpose, but all of them were devised to process single queries sequentially. In large-scale systems such as web search engines that index multimedia content, however, it is critical to deal efficiently with streams of queries rather than with single queries. In this paper we show how to achieve efficient and scalable performance in this context. To this end we transform a sequential index based on clustering into a distributed one, and devise algorithms and optimizations specially tailored to support high-performance parallel query processing.
doi:10.1016/j.jda.2008.09.010 fatcat:tyvgoy7ecrdehe626q6wjvdo4i