Compressing Inverted Index Using Optimal FastPFOR

Veluchamy Glory, Sandanam Domnic
2015 Journal of Information Processing  
Indexing plays an important role for storing and retrieving the data in Information Retrieval System (IRS). Inverted Index is the most frequently used indexing structure in IRS.  ...  The proposed method uses better integer representation and storage structure for compressing inverted index to improve the decompression performance.  ...  An inverted index contains two main parts: a lexicon file (dictionary) and inverted list (posting list).  ... 
doi:10.2197/ipsjjip.23.185 fatcat:avyovq5flbge5bcmgbhii5duxe

On Inverted Index Compression for Search Engine Efficiency [chapter]

Matteo Catena, Craig Macdonald, Iadh Ounis
2014 Lecture Notes in Computer Science  
Efficient access to the inverted index data structure is a key aspect for a search engine to achieve fast response times to users' queries.  ...  While the performance of an information retrieval (IR) system can be enhanced through the compression of its posting lists, there is little recent work in the literature that thoroughly compares and analyses  ...  Compression Techniques Inverted index compression has been common for some time.  ... 
doi:10.1007/978-3-319-06028-6_30 fatcat:4t5nkqcdvbhv7ape7wavhc6gja

Efficient searchable encryption through compression

Ioannis Demertzis, Rajdeep Talapatra, Charalampos Papamanthou
2018 Proceedings of the VLDB Endowment  
Our main idea is to utilize compression so as to reduce the size of the plaintext indexes before producing the encrypted searchable indices.  ...  In particular while ORAM is known to be prohibitively expensive for large-scale applications, we show that our compress-first-ORAM-next approach allows significantly more efficient index search time, reducing  ...  We thank James Kelley for useful suggestions.  ... 
doi:10.14778/3236187.3236218 fatcat:tpcfeovrerfsbox52o33gr2pku

Compact inverted index storage using general-purpose compression libraries

Matthias Petri, Alistair Moffat
2018 Software, Practice & Experience  
Here we re-examine mechanisms for representing document-level inverted indexes and within-document term frequencies, including comparing specialized methods developed for this task against recent fast implementations  ...  Efficient storage of large inverted indexes is one of the key technologies that support current web search services.  ...  We thank the referees for their detailed feedback. Au  ... 
doi:10.1002/spe.2556 fatcat:ork5ksb225aafpkrgosp6dhn3i

A Comparison of Document-at-a-Time and Score-at-a-Time Query Evaluation

Matt Crane, J. Shane Culpepper, Jimmy Lin, Joel Mackenzie, Andrew Trotman
2017 Proceedings of the Tenth ACM International Conference on Web Search and Data Mining - WSDM '17  
Our work controls for score quantization, document processing, compression, implementation language, implementation effort, and a number of details, arriving at an empirical evaluation that fairly characterizes  ...  Although both strategies have been extensively explored, the literature lacks a fair, direct comparison: such a study has been difficult due to vastly different query evaluation mechanics and index organizations  ...  The key observation of the BMW algorithm is that postings in inverted indexes are compressed as blocks, and every i-th posting is left uncompressed in order to support skipping [26].  ... 
doi:10.1145/3018661.3018726 dblp:conf/wsdm/CraneCLMT17 fatcat:reo2etfwwvhddaqbjirg4av4nq

Efficient Query Processing for Scalable Web Search

Nicola Tonellotto, Craig Macdonald, Iadh Ounis
2018 Foundations and Trends in Information Retrieval  
further growing the sizes of the search engines' indexes, or servicing growth in the user queries.  ...  Search engines are exceptionally important tools for accessing information in today's world.  ...  Acknowledgements We would like to thank Maarten de Rijke for his patience and encouragements during the preparation of this manuscript, as well as the three anonymous reviewers for their constructive suggestions  ... 
doi:10.1561/1500000057 fatcat:wx53qhvfhnfwfc4hgdva5ypw3u

Assessing Efficiency-Effectiveness Tradeoffs in Multi-Stage Retrieval Systems Without Using Relevance Judgments [article]

Charles L. A. Clarke, J. Shane Culpepper, Alistair Moffat
2015 arXiv pre-print
Standard top-weighted metrics used for overall system evaluation are not appropriate for assessing filtering stages, since the output is a set of documents, rather than an ordered sequence of documents  ...  Since the quality score does not require relevance judgments, it can be used to identify queries that perform particularly poorly for a given filter.  ...  Postings lists are stored compressed using the FastPFOR library [14], with skipping enabled.  ... 
arXiv:1506.00717v1 fatcat:v5k7hleeivhonl6lrmtscizqym

Optimizing Communication by Compression for Multi-GPU Scalable Breadth-First Searches [article]

Julian Romera
2017 arXiv pre-print
This work presents an alternative compression scheme for communications in distributed BFS processing. It focuses on BFS processors using General-Purpose Graphics Processing Units.  ...  The importance of this algorithm increases each day due to it is a key requirement for many data structures which are becoming popular nowadays.  ...  Inverted Indexes The Inverted indexes are the most commonly used data-structures (among other options) to implement indexer systems [67]. They consist of two parts: 1.  ... 
arXiv:1704.00513v1 fatcat:bqyyropplbfebegzhkyludbmk4