A Space-Efficient Inverted Index Technique using Data Rearrangement for String Similarity Searches
유사도 검색을 위한 데이터 재배열을 이용한 공간 효율적인 역 색인 기법

Manu Im, Jongik Kim
2015 Journal of KIISE  
An inverted index structure is widely used for efficient string similarity search. One of the main requirements of similarity search is a fast response time; to this end, most techniques use an in-memory index structure. However, because an inverted index is usually very large, it is not practical to assume that the index will fit into main memory. To alleviate this problem, we propose a novel technique that reduces the size of an inverted index. To reduce the size of an index, the proposed technique rearranges data strings so that the data strings containing the same q-grams are placed close to one another. Then, the technique encodes those multiple strings into a range. Through an experimental study using real data sets, we show that our technique significantly reduces the size of an inverted index without sacrificing query processing time.
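
The idea of rearranging strings and encoding runs of neighbouring strings as ranges can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify how strings are reordered, so the sketch assumes a plain lexicographic sort as a stand-in, and all function names (`qgrams`, `build_index`, `to_ranges`, `expand`, `index_size`) are hypothetical. It only shows why placing strings that share q-grams next to each other lets a posting list collapse into fewer entries.

```python
from collections import defaultdict


def qgrams(s, q=2):
    """Return the set of overlapping q-grams of s."""
    return {s[i:i + q] for i in range(len(s) - q + 1)}


def build_index(strings, q=2):
    """Map each q-gram to the ascending list of string IDs containing it.

    A string's ID is simply its position in `strings`, so reordering the
    input changes which IDs end up adjacent in each posting list.
    """
    index = defaultdict(list)
    for sid, s in enumerate(strings):
        for gram in qgrams(s, q):
            index[gram].append(sid)
    return dict(index)


def to_ranges(ids):
    """Collapse an ascending ID list into inclusive (start, end) ranges."""
    ranges = []
    for i in ids:
        if ranges and ranges[-1][1] + 1 == i:
            ranges[-1] = (ranges[-1][0], i)  # extend the current run
        else:
            ranges.append((i, i))            # start a new run
    return ranges


def expand(ranges):
    """Inverse of to_ranges: recover the original ID list."""
    return [i for start, end in ranges for i in range(start, end + 1)]


def index_size(index):
    """Total number of range entries over all posting lists."""
    return sum(len(to_ranges(ids)) for ids in index.values())


# Toy data (made up for illustration only).
data = ["abc", "xyz", "abd", "xyw"]

# Assumption: lexicographic sorting stands in for the paper's rearrangement.
rearranged = sorted(data)

print(index_size(build_index(data)))        # 8 range entries
print(index_size(build_index(rearranged)))  # 6 range entries
```

In the toy data, the q-gram `ab` occurs in strings 0 and 2 in the original order, which needs two separate entries; after sorting, the same strings get IDs 0 and 1 and are stored as the single range (0, 1). Because `expand` recovers the exact ID list, the encoding is lossless in this sketch.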
doi:10.5626/jok.2015.42.10.1247 fatcat:lftmj6e2mzejvdchqept57cxqy