Hybrid index maintenance for growing text collections

Stefan Büttcher, Charles L. A. Clarke, Brad Lushman
2006 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '06  
These new strategies improve upon recent results for hybrid index maintenance in dynamic text retrieval systems.  ...  We present a new family of hybrid index maintenance strategies to be used in on-line index construction for monotonically growing text collections.  ...  For the sake of brevity, we only discuss two major problems: Figure 3: Index layout for a hybrid maintenance strategy with non-contiguous posting lists.  ... 
doi:10.1145/1148170.1148233 dblp:conf/sigir/ButtcherCL06 fatcat:o7s5mofubzg5rkdle65wnm337q

A Hybrid Approach to Index Maintenance in Dynamic Text Retrieval Systems [chapter]

Stefan Büttcher, Charles L. A. Clarke
2006 Lecture Notes in Computer Science  
In-place and merge-based index maintenance are the two main competing strategies for on-line index construction in dynamic information retrieval systems based on inverted lists.  ...  Motivated by recent results for both strategies, we investigate possible combinations of in-place and merge-based index maintenance.  ...  We propose a hybrid approach to index maintenance, based on the idea that for long lists it takes more time to copy the whole list than to perform a single disk seek operation, while for short lists a  ... 
doi:10.1007/11735106_21 fatcat:q3k6l2fprrbgladxxzxnhbcuey

Incremental Text Indexing for Fast Disk-Based Search

Giorgos Margaritis, Stergios V. Anastasiadis
2014 ACM Transactions on the Web  
Instead, disk-based methods for incremental index maintenance substantially increase search latency with the index fragmented across multiple disk locations.  ...  Incremental text indexing for fast disk-based search.  ...  They would also like to thank Argyris Kalogeratos for his detailed feedback on an earlier draft.  ... 
doi:10.1145/2560800 fatcat:bdwpapill5g6lawoih67ojljaa

Low-cost management of inverted files for online full-text search

Giorgos Margaritis, Stergios V. Anastasiadis
2009 Proceeding of the 18th ACM conference on Information and knowledge management - CIKM '09  
Several recent methods that support fast incremental indexing of documents typically keep on disk multiple partial index structures that they continuously update as new documents are added.  ...  However, spreading indexing information across multiple locations on disk tends to considerably decrease the search responsiveness of the system.  ...  In the present paper, we radically simplify online index maintenance by keeping the posting lists on fixed-size blocks rather than contiguous files.  ... 
doi:10.1145/1645953.1646012 dblp:conf/cikm/MargaritisA09 fatcat:qtyfmzdcvnawnm33ksxvsej7wm

Efficient online index construction for text databases

Nicholas Lester, Alistair Moffat, Justin Zobel
2008 ACM Transactions on Database Systems  
The Search Engine Group at RMIT also deserve my gratitude for the friendship and occasionally studious atmosphere they provided. I would like to particularly thank Simon  ...  iii Declaration I certify that except where due acknowledgement has been made, the work is that of the author alone; the work has not been submitted previously, in whole or in part, to qualify for any  ...  inverted lists.  ... 
doi:10.1145/1386118.1386125 fatcat:3dsa7vy4wfd5la2cgqui5cecce

Efficient Online Index Maintenance for SSD-based Information Retrieval Systems

Ruixuan Li, Xuefan Chen, Chengzhou Li, Xiwu Gu, Kunmei Wen
2012 2012 IEEE 14th International Conference on High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems  
In this paper, we propose Hybrid Merge, a new online index maintenance strategy for information retrieval systems, which applies SSDs instead of hard disk drives (HDDs) to store inverted indexes.  ...  However, they have a very unique characteristic of erase-before-write, which probably makes existing index maintenance methods inapplicable to SSDs.  ...  A HYBRID MERGE STRATEGY FOR SSD-BASED INDEX MAINTENANCE We now present our solution to SSD-based Indexing.  ... 
doi:10.1109/hpcc.2012.43 dblp:conf/hpcc/LiCLGW12 fatcat:o7eojqjo5fdavlqutgcuyvhvze

Answering approximate string queries on large data sets using external memory

Alexander Behm, Chen Li, Michael J. Carey
2011 2011 IEEE 27th International Conference on Data Engineering  
We devise a novel physical layout for an inverted index to answer queries and we study how to construct it with limited buffer space.  ...  To answer queries, we develop a cost-based, adaptive algorithm that balances the I/O costs of retrieving candidate matches and accessing inverted lists.  ...  Index Maintenance Inverted Index: Standard techniques [14] for inverted-index maintenance can be used to update our partitioned inverted index.  ... 
doi:10.1109/icde.2011.5767856 dblp:conf/icde/BehmLC11 fatcat:yvlhae25d5abxaea2kzffiirye

On-line index maintenance using horizontal partitioning

Sairam Gurajada, Sreenivasa Kumar P
2009 Proceeding of the 18th ACM conference on Information and knowledge management - CIKM '09  
In this paper, we propose a new merge-based index maintenance strategy for Information Retrieval systems. The new model is based on partitioning of the inverted index across the terms in it.  ...  Inverted lists of the terms contained in the queries that are frequently posed to the Information Retrieval systems are kept in one partition, called frequent-term index and the other inverted lists form  ...  In this regard, off-line index provides a very high query performance by building a single large inverted index containing contiguous postings list for each term, but at the cost of having to build the  ... 
doi:10.1145/1645953.1646010 dblp:conf/cikm/GurajadaK09 fatcat:ry45w4n7erccfirnbsfng2c7ha

Efficient answering of set containment queries for skewed item distributions

Manolis Terrovitis, Panagiotis Bouros, Panos Vassiliadis, Timos Sellis, Nikos Mamoulis
2011 Proceedings of the 14th International Conference on Extending Database Technology - EDBT/ICDT '11  
We propose a novel indexing scheme, the Ordered Inverted File (OIF) which, differently from the state-of-the-art, indexes set-valued attributes in an ordered fashion.  ...  OIF is simple to implement and our experiments on both real and synthetic data show that it greatly outperforms the current state-of-the-art methods for all three classes of containment queries.  ...  The Hybrid Trie Inverted file (HTI) [42], breaks up the larger inverted lists to smaller sublists that contain known combinations of items.  ... 
doi:10.1145/1951365.1951394 dblp:conf/edbt/TerrovitisBVSM11 fatcat:tu7w3jyqpzalpdsibiskhes5yq

Scalable top-k spatial keyword search

Dongxiang Zhang, Kian-Lee Tan, Anthony K. H. Tung
2013 Proceedings of the 16th International Conference on Extending Database Technology - EDBT '13  
Various hybrid indexes have been proposed in recent years which mainly combine the R-tree and the inverted index so that spatial pruning and textual pruning can be executed simultaneously.  ...  However, the rapid growth in data volume poses significant challenges to existing methods in terms of the index maintenance cost and query processing time.  ...  Existing solutions prefer the combination of R-tree and inverted list. However, those hybrid indexes are not scalable and require high maintenance cost.  ... 
doi:10.1145/2452376.2452419 dblp:conf/edbt/ZhangTT13 fatcat:i6myhkydkjcgziquw6yt7iwj5y

Inverted files for text search engines

Justin Zobel, Alistair Moffat
2006 ACM Computing Surveys  
The development of a family of new index representations has led to a wide range of innovations in index storage, index construction, and query evaluation.  ...  We conclude with a comprehensive bibliography of text indexing literature.  ...  ACKNOWLEDGMENTS We thank the many collaborators, assistants, and students who have contributed to our work on indexing.  ... 
doi:10.1145/1132956.1132959 fatcat:u56re4tqtfg6zcpyfnzl5ne57m

Fast construction of the HYB index

Hannah Bast, Marjan Celikik
2011 ACM Transactions on Information Systems  
As shown in a series of recent works, the HYB index is an alternative to the inverted index (INV) that enables very fast prefix searches, which in turn is the basis for fast processing of many other types  ...  Finally, we show that HYB supports fast dynamic index updates more easily than INV.  ...  ACKNOWLEDGMENTS We are grateful to the anonymous referees for their outstandingly painstaking, competent, and constructive comments.  ... 
doi:10.1145/1993036.1993040 fatcat:o7sw27ptbvb6ffvt7m2xiwfs5m

Searching Web Data: An Entity Retrieval and High-Performance Indexing Model

Renaud Delbru, Stephane Campinas, Giovanni Tummarello
2012 Social Science Research Network  
We introduce an indexing methodology for semi-structured data which offers a good compromise between query expressiveness, query processing and index maintenance compared to other approaches.  ...  Finally, we demonstrate that the resulting system can index billions of data objects and provides keyword-based as well as more advanced search interfaces for retrieving relevant data objects in sub-second  ...  Each inverted file stores contiguously one type of list, and five pointers are associated to each term in the lexicon, one pointer to the beginning of the inverted list in each inverted file.  ... 
doi:10.2139/ssrn.3198931 fatcat:mimvtyqbkbhaniijpxrl3wqu2i

Searching web data: An entity retrieval and high-performance indexing model

Renaud Delbru, Stephane Campinas, Giovanni Tummarello
2012 Journal of Web Semantics  
We introduce an indexing methodology for semi-structured data which offers a good compromise between query expressiveness, query processing and index maintenance compared to other approaches.  ...  Finally, we demonstrate that the resulting system can index billions of data objects and provides keyword-based as well as more advanced search interfaces for retrieving relevant data objects in sub-second  ...  Each inverted file stores contiguously one type of list, and five pointers are associated to each term in the lexicon, one pointer to the beginning of the inverted list in each inverted file.  ... 
doi:10.1016/j.websem.2011.04.004 fatcat:ts4tyui34nf7ldub2l7tcavdpy

Parallel computing in information retrieval – an updated review

A. MacFarlane, S.E. Robertson, J.A. McCann
1997 Journal of Documentation  
machine architectures which have been used for parallel IR systems.  ...  The DAP [7] organisation is an array of 1-bit processing elements (PEs) arranged in a 32 by 32 matrix for the 500 series and 64 by 64 for the 600 series; 1024 and 4096 PE's in total respectively.  ...  ACKNOWLEDGEMENTS This research is supported by the Department for Education and Employment, grant number IS96/4197.  ... 
doi:10.1108/eum0000000007201 fatcat:2zuwtehixbd6xk33hwb3j43nse