Improving Web Searches with Distributed Buckets Structures

V. Costa, A. Printista, M. Marin
2006 2006 Fourth Latin American Web Congress  
We use the inverted files as the data structure and the vector space model to perform the ranking of documents.  ...  The design of the server that processes the queries, is effected on top of the Bulk Synchronous-BSP model of parallel computing, to study how query performance is affected by the index organization.  ...  In the algorithms implemented, the vector space model is adopted as the ranking strategy, and the inverted files are used as index structures.  ... 
doi:10.1109/la-web.2006.18 dblp:conf/la-web/CostaPM06 fatcat:xokhjcsybfcgxjfup6dbuh6zmy

Optimization of inverted vector searches

Chris Buckley, Alan F. Lewit
1985 Proceedings of the 8th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '85  
A simple algorithm is presented for increasing the efficiency of information retrieval searches which are implemented using inverted files.  ...  This optimization algorithm employs knowledge about the methods used for weighting document and query terms in order to examine as few inverted lists as possible.  ...  This approach can be contrasted to that proposed by Harper[12] who also used an inverted file implementation of the inner product similarity function (although with binary document weights).  ... 
doi:10.1145/253495.253515 dblp:conf/sigir/BuckleyL85 fatcat:k3afei7rqrextce6md2mnik3cy

Memory efficient ranking

Alistair Moffat, Justin Zobel, Ron Sacks-Davis
1994 Information Processing & Management  
an array of partial similarity accumulators, and address tables for inverted file entries and documents.  ...  Fast and effective ranking of a collection of documents with respect to a query requires several structures, including a vocabulary, inverted file entries, arrays of term weights and document lengths,  ...  Acknowledgements We would like to thank James Thom for a number of helpful suggestions, and Neil Sharman for his assistance with the implementation.  ... 
doi:10.1016/0306-4573(94)90002-7 fatcat:vigacpy7kvh6hkhggdae43anhu

Filtered document retrieval with frequency-sorted indexes

Michael Persin, Justin Zobel, Ron Sacks-Davis
1996 Journal of the American Society for Information Science  
We propose an evaluation technique that uses early recognition of which documents are likely to be highly ranked to reduce costs; for our test data, queries are evaluated in 2% of the memory of the standard  ...  The principle of the index design is that inverted lists are sorted by decreasing within-document frequency rather than by document number, and this method experimentally reduces cpu time and disk traffic  ...  Uncompressed inverted files Our structure for inverted files, where documents in inverted lists are ordered by decreasing f_{d,t}, would also be effective in systems that use uncompressed inverted files  ... 
doi:10.1002/(sici)1097-4571(199610)47:10<749::aid-asi3>3.0.co;2-2 fatcat:6xb43kjkmnek7fahu5hoebzscm

On the Cost of Phrase-Based Ranking

Matthias Petri, Alistair Moffat
2015 Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval - SIGIR '15  
suffix-array and inverted file-based phrase retrieval indexes using a standard IR test collection.  ...  Effective postings list compression techniques, and the efficiency of postings list processing schemes such as WAND, have significantly improved the practical performance of ranked document retrieval using  ...  This work was funded by the Australian Research Council's Discovery Project scheme (project DP140103256), and by the Victorian Life Sciences Computation Initiative (grant VR0052), an initiative of the  ... 
doi:10.1145/2766462.2767769 dblp:conf/sigir/PetriM15 fatcat:hrfkmjnqpnbifp5t635nrilorm

The efficiency of inverted index and cluster searches

Ellen M. Voorhees
1986 Proceedings of the 9th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '86  
Recent research by the author corroborates these findings and also shows that the partial ranking produced by a top-down search of the complete link hierarchy can be more effective than ranking the collection  ...  The efficiency of each search is measured in two ways: the number of bytes required to store the auxiliary files required by the search, and the mean time required to retrieve a set of documents for a  ...  Acknowledgments Thanks go to Professor Gerard Salton and the members of the SMART group at Cornell for many helpful discussions, and especially for suggestions on how to improve the efficiency of the cluster  ... 
doi:10.1145/253168.253203 dblp:conf/sigir/Voorhees86 fatcat:llduxqr2gfg4xf6xkofe4oycge

Filtered document retrieval with frequency‐sorted indexes

Michael Persin, Justin Zobel, Ron Sacks‐Davis
1996 Journal of the American Society for Information Science  
We propose an evaluation technique that uses early recognition of which documents are likely to be highly ranked to reduce costs; for our test data, queries are evaluated in 2% of the memory of the standard  ...  The principle of the index design is that inverted lists are sorted by decreasing within-document frequency rather than by document number, and this method experimentally reduces cpu time and disk traffic  ...  Uncompressed inverted files Our structure for inverted files, where documents in inverted lists are ordered by decreasing f_{d,t}, would also be effective in systems that use uncompressed inverted files  ... 
doi:10.1002/(sici)1097-4571(199610)47:10<749::aid-asi3>3.3.co;2-u fatcat:b5ojxkacyrb3rgjwb4unph2lxy

Efficient distributed algorithms to build inverted files

Berthier Ribeiro-Neto, Edleno S. Moura, Marden S. Neubert, Nivio Ziviani
1999 Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '99  
We present three distributed algorithms to build global inverted files for very large text collections.  ...  The distributed environment we use is a high bandwidth network of workstations with a shared-nothing memory organization.  ...  In [8, 9], however, it is proposed to sort the inverted lists in the inverted file by the frequency of occurrence of terms in documents.  ... 
doi:10.1145/312624.312663 dblp:conf/sigir/Ribeiro-NetoMNZ99 fatcat:kx2vsttusncltk7lyzf3bnczxa

Top-k Ranked Document Search in General Text Databases [chapter]

J. Shane Culpepper, Gonzalo Navarro, Simon J. Puglisi, Andrew Turpin
2010 Lecture Notes in Computer Science  
Our best approach is significantly faster than existing methods in RAM, and is even three times faster than a state-of-the-art inverted file implementation for English text when word queries are issued  ...  Text search engines return a set of k documents ranked by similarity to a query.  ...  In fact, the new ranked document search algorithms are three times faster than a highly tuned inverted file implementation that assumes terms to be English words.  ... 
doi:10.1007/978-3-642-15781-3_17 fatcat:dftjikmzibghjbp5bxvyvx2bri

Inverted files for text search engines

Justin Zobel, Alistair Moffat
2006 ACM Computing Surveys  
In this tutorial, we introduce the key techniques in the area, describing both a core implementation and how the core can be enhanced through a range of extensions.  ...  The development of a family of new index representations has led to a wide range of innovations in index storage, index construction, and query evaluation.  ...  Jamie Callan, Bruce Croft, Donna Harman, Mike Lesk, and Ellen Voorhees helped us identify some of the early work in the area.  ... 
doi:10.1145/1132956.1132959 fatcat:u56re4tqtfg6zcpyfnzl5ne57m

Structured information retrieval in XML documents

Evangelos Kotsakis
2002 Proceedings of the 2002 ACM symposium on Applied computing - SAC '02  
We employ a class of queries that support path expressions and suggest an efficient index, which extends the inverted file structure to search XML documents.  ...  This is accomplished by integrating the XML structure in the inverted file by combining the inverted file with a path index.  ...  Indexing structures for documents are discussed in [7] and for structured documents in [12] . Inverted files and signature files have been used only for searching literal terms.  ... 
doi:10.1145/508791.508919 dblp:conf/sac/Kotsakis02 fatcat:si4t5pv7mbbrditkjyh4vzkbea

Structured information retrieval in XML documents

Evangelos Kotsakis
2002 Proceedings of the 2002 ACM symposium on Applied computing - SAC '02  
We employ a class of queries that support path expressions and suggest an efficient index, which extends the inverted file structure to search XML documents.  ...  This is accomplished by integrating the XML structure in the inverted file by combining the inverted file with a path index.  ...  Indexing structures for documents are discussed in [7] and for structured documents in [12] . Inverted files and signature files have been used only for searching literal terms.  ... 
doi:10.1145/508909.508919 fatcat:rvo2r2alizf2tnfilosbcczpyq

Page 6 of Journal of Research and Practice in Information Technology Vol. 26, Issue 1 [page]

1994 Journal of Research and Practice in Information Technology  
One area of current investigation is the use of techniques that allow inverted file entries to be intersected with only partial decoding being necessary, using self-indexing inverted file entries of  ...  Using the “local Golomb” code, the final inverted file required 132 Mb, or 6.4% of the input text. Of this, about 40 Mb is the gamma coded “frequency-within-document” values.  ... 

Effect of Inverted Index Partitioning Schemes on Performance of Query Processing in Parallel Text Retrieval Systems [chapter]

B. Barla Cambazoglu, Aytul Catal, Cevdet Aykanat
2006 Lecture Notes in Computer Science  
Performance results are reported for a large (30 GB) document collection using an MPI-based parallel query processing implementation.  ...  Shared-nothing, parallel text retrieval systems require an inverted index, representing a document collection, to be partitioned among a number of processors.  ...  For document ranking, they use the vector-space model and conduct their experiments on a real-life document collection.  ... 
doi:10.1007/11902140_75 fatcat:p5xepgwmxbg75fjxbq26zsmrky

Fast Concurrency Control for Distributed Inverted Files [chapter]

Mauricio Marín
2005 Lecture Notes in Computer Science  
A new method for controlling concurrent read/write operations upon inverted files is proposed and evaluated.  ...  Communication and synchronization among processors is effected by ways of the bulk-synchronous parallel model of computing.  ...  The whole collection of documents is used to produce a single inverted file index which is identical to the sequential one.  ... 
doi:10.1007/11428831_51 fatcat:3fjfrgivqfawjcl7ilt5ahdjpu