Efficient top-k retrieval with signatures

Timothy Chappell, Shlomo Geva, Anthony Nguyen, Guido Zuccon
2013 Proceedings of the 18th Australasian Document Computing Symposium - ADCS '13  
This paper describes a new method of indexing and searching large binary signature collections to efficiently find similar signatures, addressing the scalability problem in signature search. Signatures offer efficient computation with an acceptable measure of similarity in numerous applications. However, performing a complete search with a given search argument (a signature) requires a Hamming distance calculation against every signature in the collection. This quickly becomes excessive when dealing with large collections, presenting issues of scalability that limit the applicability of signatures. Our method efficiently finds similar signatures in very large collections, trading memory use and precision for greatly improved search speed. Experimental results demonstrate that our approach is capable of finding a set of nearest signatures to a given search argument with a high degree of speed and fidelity.
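For illustration, the exhaustive search that the abstract identifies as the scalability bottleneck can be sketched as follows. This is a minimal sketch of the baseline only, not the indexing method proposed in the paper. It assumes signatures are stored as Python integers, and the names `hamming` and `top_k_exhaustive` are hypothetical.

```python
import heapq


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two signatures differ."""
    return bin(a ^ b).count("1")


def top_k_exhaustive(query: int, signatures: list[int], k: int) -> list[tuple[int, int]]:
    """Return the k nearest signatures as (distance, index) pairs, nearest first.

    Every signature in the collection is compared against the query,
    so the cost grows linearly with the collection size.
    """
    scored = ((hamming(query, sig), i) for i, sig in enumerate(signatures))
    return heapq.nsmallest(k, scored)


# Toy 8-bit collection (assumed data, for illustration only).
collection = [0b10110010, 0b10110011, 0b01001101, 0b10010010]
print(top_k_exhaustive(0b10110010, collection, k=2))
```

Each query touches every signature once, which is the per-query cost the abstract describes as excessive for large collections.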
doi:10.1145/2537734.2537742 dblp:conf/adcs/ChappellGNZ13 fatcat:igt5rgn3drhp5et4s7jqlfblcy