Incremental cluster-based retrieval using compressed cluster-skipping inverted files

Ismail Sengor Altingovde, Engin Demir, Fazli Can, Özgür Ulusoy
2008 ACM Transactions on Information Systems  
We propose a unique cluster-based retrieval (CBR) strategy using a new cluster-skipping inverted file for improving query processing efficiency.  ...  Our experiments with various collections show that the incremental-CBR strategy using a compressed cluster-skipping inverted file significantly improves CPU time efficiency, regardless of query length.  ...  Incremental Cluster-Based Retrieval In incremental CBR, we determine the best clusters by only accessing the cluster-skipping IIS.  ... 
doi:10.1145/1361684.1361688 fatcat:3iuznbyiobdzpp3m4ih7qvquuu

Efficiency and effectiveness of query processing in cluster-based retrieval

Fazli Can, Ismail Sengör Altingövde, Engin Demir
2004 Information Systems  
Our research shows that for large databases, without considerable additional storage overhead, cluster-based retrieval (CBR) can compete with the time efficiency and effectiveness of the inverted index-based  ...  The proposed CBR method employs a storage structure that blends the cluster membership information into the inverted file posting lists.  ...  Acknowledgements We appreciate the comments made by a referee; they helped us improve the presentation.  ... 
doi:10.1016/s0306-4379(03)00062-0 fatcat:bn5xke6a7fbu7n4wrflh2ubczy

Site-based dynamic pruning for query processing in search engines

Ismail Sengor Altingovde, Engin Demir, Fazli Can, Özgür Ulusoy
2008 Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '08  
Web search engines typically index and retrieve at the page level.  ...  The resulting typical and cluster-skipping inverted files take 6.3 GB and 6.6 GB (uncompressed) and 767 MB and 785 MB (compressed), respectively.  ...  Next, the typical inverted index and CS-IIS are constructed. Both files are compressed using the same procedures as described in [1].  ... 
doi:10.1145/1390334.1390543 dblp:conf/sigir/AltingovdeDCU08 fatcat:cjbhl57bc5at5bnzedcrrulkzy

Cluster searching strategies for collaborative recommendation systems

Ismail Sengor Altingovde, Özlem Nurcan Subakan, Özgür Ulusoy
2013 Information Processing & Management  
We provide an efficient implementation of this strategy by adapting a specifically tailored cluster-skipping inverted index structure.  ...  Cluster-based collaborative filtering techniques can be a remedy for the efficiency problem, but they usually provide relatively lower accuracy figures, since they may become over-generalized and produce  ...  CF with cluster-skipping inverted index and using individualistic strategy.  ... 
doi:10.1016/j.ipm.2012.07.008 fatcat:7s66osvnmfgthoktu6pn2bqjfi

Inverted files for text search engines

Justin Zobel, Alistair Moffat
2006 ACM Computing Surveys  
Jamie Callan, Bruce Croft, Donna Harman, Mike Lesk, and Ellen Voorhees helped us identify some of the early work in the area.  ...  Can [1994] proposed a combination of cluster search and inverted file search, arguing that it is advantageous to organize inverted list entries by cluster, but interest in cluster-based searching has  ...  Lee et al. [1996] compare inverted files and signature files for keyword-based retrieval of structured documents.  ... 
doi:10.1145/1132956.1132959 fatcat:u56re4tqtfg6zcpyfnzl5ne57m

Exploiting Available Memory and Disk for Scalable Instant Overview Search [chapter]

Pavlos Fafalios, Yannis Tzitzikas
2011 Lecture Notes in Computer Science  
Furthermore we show that an incremental algorithm can be used to keep the index structure fresh.  ...  techniques, or metadata-based groupings of the results.  ...  , (2) its suggestion system is word based, not query based, i.e. it suggests only words that match user's current input, not whole queries, (3) it focuses on compression of index structures, especially  ... 
doi:10.1007/978-3-642-24434-6_8 fatcat:nxjpngnlrncdxhchisvhw6hbou

An effective region-based image retrieval framework

Feng Jing, Mingjing Li, Hong-Jiang Zhang, Bo Zhang
2002 Proceedings of the tenth ACM international conference on Multimedia - MULTIMEDIA '02  
The framework consists of methods for image segmentation and grouping, indexing using modified inverted file, relevance feedback, and continuous learning.  ...  Based on this representation, an indexing scheme similar to the inverted file technology is proposed. In addition, it supports relevance feedback based on the vector model with a weighting scheme.  ...  IF denotes indexing using traditional inverted files, while MIF denotes our modified inverted file strategy described in Section 3.2.  ... 
doi:10.1145/641103.641106 fatcat:4cwizvvwejafrofxsp4zroitcq

An effective region-based image retrieval framework

Feng Jing, Mingjing Li, Hong-Jiang Zhang, Bo Zhang
2002 Proceedings of the tenth ACM international conference on Multimedia - MULTIMEDIA '02  
The framework consists of methods for image segmentation and grouping, indexing using modified inverted file, relevance feedback, and continuous learning.  ...  Based on this representation, an indexing scheme similar to the inverted file technology is proposed. In addition, it supports relevance feedback based on the vector model with a weighting scheme.  ...  IF denotes indexing using traditional inverted files, while MIF denotes our modified inverted file strategy described in Section 3.2.  ... 
doi:10.1145/641007.641106 dblp:conf/mm/JingLZZ02 fatcat:j3wir4lciza6rk7hdhrog3ldhi

Fast evaluation of structured queries for information retrieval

Eric W. Brown
1995 Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '95  
We present a new structured query optimization technique which we have implemented in an inference network-based information retrieval system.  ...  In an effort to provide better retrieval performance on large collections, more sophisticated retrieval techniques have been developed that support rich, structured queries.  ...  Acknowledgements Thanks go to Jamie Callan for his helpful comments on early drafts of this paper, Bruce Croft for his guidance throughout this work, and the staff and students of CIIR for their work on the systems used  ... 
doi:10.1145/215206.215329 dblp:conf/sigir/Brown95 fatcat:u4wgjdios5chnbjmy655n3uy3i

Visual Vocabulary Learning and Its Application to 3D and Mobile Visual Search [article]

Liujuan Cao
2012 arXiv   pre-print
In this technical report, we review related works and recent trends in visual vocabulary based web image search, object recognition, mobile visual search, and 3D object retrieval.  ...  Especial focuses would be also given for the recent trends in supervised/unsupervised vocabulary optimization, compact descriptor for visual search, as well as in multi-view based 3D object representation  ...  ); • Quantizing these local features into a Bag-of-Words histogram using the vocabulary; • Ranking similar images in the inverted indexing files of all non-empty words, so as to avoid the linear scanning  ... 
arXiv:1207.7244v1 fatcat:b6y7yvvcu5davkwh3zuv762qq4

Hybrid Indexing for Versioned Document Search with Cluster-based Retrieval

Xin Jin, Daniel Agun, Tao Yang, Qinghao Wu, Yifan Shen, Susen Zhao
2016 Proceedings of the 25th ACM International on Conference on Information and Knowledge Management - CIKM '16  
This paper proposes an alternative approach that uses cluster-based retrieval to quickly narrow the search scope guided by version representatives at Phase 1 and develops a hybrid index structure with  ...  The second phase then re-ranks top document versions using positional information with fragment-based index compression.  ...  Document ranking using cluster-based retrieval and language models is studied by Liu and Croft [30], and cluster ranking is addressed by Kurland and Krikon [28].  ... 
doi:10.1145/2983323.2983733 dblp:conf/cikm/JinAYWSZ16 fatcat:qpws7znec5a3hj2psxaj262bvu

Efficient query processing in distributed search engines

Simon Jonassen
2012 SIGIR Forum  
We present an efficient self-skipping inverted index designed for modern index compression methods and several query processing optimizations.  ...  We carefully evaluate our ideas either using a real implementation or by simulation using real-world text collections and query logs.  ...  Rocha-Junior for the useful advices and comments on the paper. Acknowledgments. This work was done while the second author was an intern at Yahoo!  ... 
doi:10.1145/2492189.2492201 fatcat:uwasxhngrfgntemkhawyv3te64

Very Large Scale Information Retrieval [chapter]

David Hawking
2003 Lecture Notes in Computer Science  
This chapter is based on a series of five lectures presented at the ELSNET TesTia Summer School held in Chios, Greece in July, 2000.  ...  The final result is the document table and inverted file as shown in Figure 9. Processing Queries Using an Inverted File.  ...  Note that the position lists can be compressed even though the skip records consume some additional information.  ... 
doi:10.1007/978-3-540-45115-0_5 fatcat:htpclfiodvgvpesebdfmwf2m5m

Structured Index Organizations for High-Throughput Text Querying [chapter]

Vo Ngoc Anh, Alistair Moffat
2006 Lecture Notes in Computer Science  
Inverted indexes are the preferred mechanism for supporting content-based queries in text retrieval systems, with the various data items usually stored compressed in some way.  ...  In this study we describe an inverted index organization that provides efficient support for all of conjunctive Boolean queries, ranked queries, and phrase queries.  ...  Introduction Inverted indexes are the preferred mechanism for supporting content-based queries in text retrieval systems, with the various data items usually stored compressed in some way [Witten et al  ... 
doi:10.1007/11880561_25 fatcat:qo7iilcbvffercbts3dwmu7fii

Scalable, flexible and generic instant overview search

Pavlos Fafalios, Ioannis Kitsos, Yannis Tzitzikas
2012 Proceedings of the 21st international conference companion on World Wide Web - WWW '12 Companion  
We demonstrate how this can be achieved using very modest hardware.  ...  Our approach relies on (a) a partitioned trie-based index that exploits the available main memory and disk, and (b) dedicated caching techniques.  ...  For the top suggestion, we access the corresponding random access file and retrieve its first page of results and its cluster label tree.  ... 
doi:10.1145/2187980.2188042 dblp:conf/www/FafaliosKT12 fatcat:75j6mfh4ljepbbbw46g276xnwi