Bit-sliced signature files for very large text databases on a parallel machine architecture [chapter]

George Panagopoulos, Christos Faloutsos
1994 Lecture Notes in Computer Science  
Free text retrieval is an important problem which can significantly benefit from a parallel architecture. Signature methods have been proposed to answer text retrieval queries in parallel machines [Sta88, LF92], under the assumption that the main memory is sufficient to hold the entire signature file. We propose the use of a Parallel Bit-Sliced Signature File method on a SIMD machine architecture when the size of the signature file exceeds the available memory. We propose that we need not examine all the bit slices; instead we use a partial fetch slice swapping algorithm. This method achieves graceful performance degradation according to the database size. We provide formulae for the optimal number of signature slices to fetch and match with the query signature. Arithmetic examples show that our method can handle a 128 GB database with a 2 sec response time on a machine with the characteristics of the Connection Machine.
doi:10.1007/3-540-57818-8_65 fatcat:l2ljjnllzjdpfol3gfbqkl5z5e
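
The partial-fetch idea in the abstract can be illustrated with a minimal sequential sketch in Python. It is not the authors' implementation: the signature length `F`, the bits per word `M`, the SHA-256 based hashing, and the function names `word_bits`, `build_slices`, and `candidates` are all assumptions made for illustration, and the paper's formulae for the optimal number of slices are not reproduced here. Each bit slice is stored as one Python integer whose bit `d` belongs to document `d`; a query ANDs only the slices at positions set in the query signature, and `max_slices` caps how many of those slices are examined.

```python
import hashlib

F = 64  # signature length in bits (assumed value, for illustration only)
M = 3   # bit positions set per word (assumed value)

def word_bits(word):
    """Return the set of signature bit positions that a word sets."""
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    return {int.from_bytes(digest[2 * i:2 * i + 2], "big") % F for i in range(M)}

def build_slices(docs):
    """Build F bit slices; bit d of slice j is 1 if document d sets position j."""
    slices = [0] * F
    for d, text in enumerate(docs):
        for word in text.lower().split():
            for j in word_bits(word):
                slices[j] |= 1 << d
    return slices

def candidates(slices, n_docs, query_words, max_slices=None):
    """AND the slices selected by the query signature.

    With max_slices set, only that many slices are examined (a partial
    fetch), which can only add false drops, never lose a true match.
    """
    positions = sorted(set().union(*(word_bits(w.lower()) for w in query_words)))
    if max_slices is not None:
        positions = positions[:max_slices]
    result = (1 << n_docs) - 1  # start with every document as a candidate
    for j in positions:
        result &= slices[j]
    return [d for d in range(n_docs) if (result >> d) & 1]

docs = [
    "parallel text retrieval",
    "signature file methods",
    "parallel signature file",
]
slices = build_slices(docs)
full = candidates(slices, len(docs), ["parallel", "signature"])
partial = candidates(slices, len(docs), ["parallel", "signature"], max_slices=2)
print(full, partial)
```

Every document in `full` also appears in `partial`, since examining fewer slices can only weaken the filter. Candidates from either call may still include false drops and would need to be checked against the actual text.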