Static score bucketing in inverted indexes
2005
Proceedings of the 14th ACM international conference on Information and knowledge management - CIKM '05
This heuristic, however, increases the cost of index generation and requires complex index build algorithms. In this paper, we study a new index organization based on static score bucketing. ...
Maintaining strict static score order of inverted lists is a heuristic used by search engines to improve the quality of query results when the entire inverted lists cannot be processed. ...
INDEXING WITH STATIC SCORE BUCKETING In previous work, Long and Suel [6] have proposed an inverted lists organization that is based on a static rank order of the postings. ...
doi:10.1145/1099554.1099642
dblp:conf/cikm/BotevEFLS05
fatcat:wclbl5afs5dgdotmlo2tra7d3q
Embellishing text search queries to protect user privacy
2010
Proceedings of the VLDB Endowment
In this paper, we identify the privacy risks arising from semantically related search terms within a query, and from recurring high-specificity query terms in a search session. ...
We also provide an accompanying retrieval scheme that enables the search engine to compute the encrypted document relevance scores from only the genuine search terms, yet remain oblivious to their distinction ...
The similarity scoring model and the inverted index implementation are used extensively in modern document retrieval systems. They also form the foundation of Web search engines. ...
doi:10.14778/1920841.1920918
fatcat:rcr6id53tva33cyirgugr45574
Evaluation of a bedside test of utricular function – the bucket test – in older individuals
2014
Acta Oto-Laryngologica
Dizziness Handicap Index (DHI), in 51 older individuals aged 70-95 years. ...
Results: Bucket test scores are correlated in both magnitude and direction with utricle-selective tap-evoked oVEMP asymmetry ratios, but not with sound-evoked cVEMP asymmetry ratios, which are saccule-selective ...
Table II Association between bucket test outcome and clinical variables. The values given in bold denote statistical significance. AR, asymmetry ratio; DHI, Dizziness Handicap Index. ...
doi:10.3109/00016489.2013.867456
pmid:24460151
pmcid:PMC4285154
fatcat:jviww2sgkfaqhmzwywwhrr7sqa
Cache-conscious performance optimization for similarity search
2013
Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval - SIGIR '13
Because of data sparsity, accessing feature vectors in memory for runtime comparison in the second stage incurs significant overhead due to the presence of a memory hierarchy. ...
All-pairs similarity search can be implemented in two stages. The first stage is to partition the data and group potentially similar vectors. ...
This work is supported in part by NSF IIS-1118106/0905084 and Kuwait University Scholarship. ...
doi:10.1145/2484028.2484077
dblp:conf/sigir/AlabduljalilTY13
fatcat:ptk7klfakbhbnpqwzdmdcs2wxi
Streaming similarity search over one billion tweets using parallel locality-sensitive hashing
2013
Proceedings of the VLDB Endowment
We show that this is an order of magnitude faster than existing indexing schemes, such as inverted indexes. ...
In this paper, we describe a new variant of LSH, called Parallel LSH (PLSH), designed to be extremely efficient, capable of scaling out on multiple nodes and multiple cores, and which supports high-throughput ...
ACKNOWLEDGEMENTS This work was supported by a grant from Intel, as a part of the Intel Science and Technology Center in Big Data (ISTC-BD). ...
doi:10.14778/2556549.2556574
fatcat:z7c2qdi2lvewlkamfphubde7ky
Improved techniques for result caching in web search engines
2009
Proceedings of the 18th international conference on World wide web - WWW '09
Finally, using the same approach, we also obtain performance gains for the related problem of inverted list caching. ...
Query processing is a major cost factor in operating large web search engines. In this paper, we study query result caching, one of the main techniques used to optimize query processing performance. ...
Acknowledgements: We thank Xiaojun Hei for collaboration in the early stages of this work, and Keith Ross and Dan Rubenstein for valuable discussions of caching under Zipfian distributions. ...
doi:10.1145/1526709.1526768
dblp:conf/www/GanS09
fatcat:twihgvjmgretthjptzbqxyw3x4
Accelerating instant question search with database techniques
2011
Proceedings of the 20th international conference companion on World wide web - WWW '11
In this paper, we propose a user-support tool for composing questions in such services. ...
Distributed question answering services, like Yahoo Answer and Aardvark, are known to be useful for end users and have also opened up numerous topics ranging in many research fields. ...
When the number of buckets becomes too large, the buckets are projected into another set of buckets. ...
doi:10.1145/1963192.1963290
dblp:conf/www/EdaUBKCK11
fatcat:k43wwxmbpfe5tm4iax5bmqcoie
Compressing term positions in web indexes
2009
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval - SIGIR '09
We focus on techniques for compressing term positions in web search engine indexes. ...
This has led to a lot of research on how to improve query throughput, using techniques such as massive parallelism, caching, early termination, and inverted index compression. ...
Inverted Index Compression Many different inverted index compression techniques have been proposed in the literature [28]. ...
doi:10.1145/1571941.1571969
dblp:conf/sigir/YanDS09
fatcat:2focpoagrjf3hm4qpnx5nvqkma
Performance of query processing implementations in ranking-based text retrieval systems using inverted indices
2006
Information Processing & Management
To our knowledge, six of these techniques are not discussed in any other publication before. ...
Similarity calculations and document ranking form the computationally expensive parts of query processing in ranking-based text retrieval. ...
An inverted index is composed of two parts: a set of inverted lists and an index into these lists. ...
doi:10.1016/j.ipm.2005.06.004
fatcat:v5xx6y2255arleq2u2vzm57hyu
Temporal Spatial-Keyword Top-k publish/subscribe
2015
2015 IEEE 31st International Conference on Data Engineering
Users are interested in receiving up-to-date tweets such that their locations are close to a user specified location and their texts are interesting to users. ...
The TaSK query takes into account text relevance, spatial proximity, and recency of geo-textual objects in evaluating its relevance with a geo-textual object. ...
ACKNOWLEDGMENT This work is supported in part by a grant awarded by a Singapore MOE AcRF Tier 2 Grant (ARC30/12). ...
doi:10.1109/icde.2015.7113289
dblp:conf/icde/ChenCCT15
fatcat:x7m4j4tkujgizgbi27nswq4jja
External-Memory Multimaps
[chapter]
2011
Lecture Notes in Computer Science
For example, the inverted file data structure that is used prevalently in the infrastructure supporting search engines is a type of multimap, where words are used as keys and document pointers are used ...
The key technique used to achieve our results is a combination of cuckoo hashing using buckets that hold multiple items with a multiqueue implementation to cope with varying numbers of values per key. ...
In this case, the multimap could be viewed as providing a dynamic functionality for a classic static data structure, known as an inverted file or inverted index (e.g., see Knuth [11]). ...
doi:10.1007/978-3-642-25591-5_40
fatcat:cmzvr5xyhjad3icmh4a4fnm5my
External-Memory Multimaps
[article]
2011
arXiv
pre-print
For example, the inverted file data structure that is used prevalently in the infrastructure supporting search engines is a type of multimap, where words are used as keys and document pointers are used ...
The key technique used to achieve our results is a combination of cuckoo hashing using buckets that hold multiple items with a multiqueue implementation to cope with varying numbers of values per key. ...
In this case, the multimap could be viewed as providing a dynamic functionality for a classic static data structure, known as an inverted file or inverted index (e.g., see Knuth [11]). ...
arXiv:1104.5533v2
fatcat:ghuoeyxt2jcbzmpkzndu6zqz7i
External-Memory Multimaps
2013
Algorithmica
For example, the inverted file data structure that is used prevalently in the infrastructure supporting search engines is a type of multimap, where words are used as keys and document pointers are used ...
The key technique used to achieve our results is a combination of cuckoo hashing using buckets that hold multiple items with a multiqueue implementation to cope with varying numbers of values per key. ...
In this case, the multimap could be viewed as providing a dynamic functionality for a classic static data structure, known as an inverted file or inverted index (e.g., see Knuth [11]). ...
doi:10.1007/s00453-013-9770-7
fatcat:3cqa6u3njng6leyrn3gsud74je
Maguro, a system for indexing and searching over very large text collections
2013
Proceedings of the sixth ACM international conference on Web search and data mining - WSDM '13
Maguro is part of the serving stack in Bing and allows us to scale the index significantly better. ...
A long tail distribution of content calls for different trade-offs in the design space for good efficiency across the entire index range. ...
In addition, we would like to thank Qi Lu, Harry Shum, and Chad Walters for their support throughout the project. ...
doi:10.1145/2433396.2433486
dblp:conf/wsdm/RisvikCTKA13
fatcat:d2uz2xu7hvetlo4mjwdcsq63i4
Relevance Matters: Capitalizing on Less (Top-k Matching in Publish/Subscribe)
2012
2012 IEEE 28th International Conference on Data Engineering
The efficient processing of large collections of Boolean expressions plays a central role in major data intensive applications ranging from user-centric processing and personalization to real-time data ...
Finally, the performance of BE*-Tree is proven through a comprehensive experimental comparison against state-of-the-art index structures for matching Boolean expressions. ...
In contrast, a scalable top-k model, but based on a static and flat structure, with a generic scoring function, which also takes the event into consideration, is introduced in k-index [2] . ...
doi:10.1109/icde.2012.38
dblp:conf/icde/SadoghiJ12
fatcat:nzt236ufv5drfpeubtklphvy7m