
Heavy-tailed distributions and multi-keyword queries

Surajit Chaudhuri, Kenneth Church, Arnd Christian König, Liying Sui
2007 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '07  
These multi-keyword indexes limit the number of postings accessed when computing arbitrary index intersections.  ...  of multi-keyword indexes, which significantly improve worst-case performance without requiring excessive storage.  ...  δ tail keywords from consideration.  ... 
doi:10.1145/1277741.1277855 dblp:conf/sigir/ChaudhuriCKS07 fatcat:2e7xp3ocqfcafpyjii6pewb5ae

Hurry-up: Scaling Web Search on Big/Little Multi-core Architectures [article]

Rajiv Nishtala and Xavier Martorell (Norwegian University of Science and Technology, Barcelona Supercomputing Center)
2019 arXiv   pre-print
We implement and deploy Hurry-up on a real 64-bit big/little architecture (ARM Juno), and show that, compared to a conservative policy on Linux, Hurry-up reduces the server tail latency by 39.5% (mean)  ...  Heterogeneous multi-core systems such as big/little architectures have been introduced as an attractive server design option with the potential to improve performance under power constraints in data centres  ...  Fig. 2: Query latency distribution on different number of cores (1 or 2) and types (big or little).  ... 
arXiv:1912.09844v1 fatcat:way7n4tdkfe6hlmrdn2mxecaym

Efficient Multi-Keyword Ranked Query on Encrypted Data in the Cloud

Zhiyong Xu, Wansheng Kang, Ruixuan Li, Kinchoong Yow, Cheng-Zhong Xu
2012 2012 IEEE 18th International Conference on Parallel and Distributed Systems  
Most current works only consider single keyword queries without appropriate ranking schemes. The multi-keyword query problem was being considered only recently.  ...  It introduces an efficient way to achieve management flexibility and economic savings for distributed applications.  ...  (also known as Zipf's Law and long tails), thus the access frequency (weight) always follows the heavy tail distributions.  ... 
doi:10.1109/icpads.2012.42 dblp:conf/icpads/XuKLYX12 fatcat:6dzrfn556bfvtaqe3lvzyycfrm

Efficient multi-keyword ranked query over encrypted data in cloud computing

Ruixuan Li, Zhiyong Xu, Wanshang Kang, Kin Choong Yow, Cheng-Zhong Xu
2014 Future generations computer systems  
In this paper, we propose a flexible multi-keyword query scheme, called MKQE to address the aforementioned drawbacks.  ...  In the current multi-keyword ranked search approach, the keyword dictionary is static and cannot be extended easily when the number of keywords increases.  ...  under grants 2013QN120, 2012TS052 and 2012TS053.  ... 
doi:10.1016/j.future.2013.06.029 fatcat:pdt7jrwobvd6ljft2zx2pfbalm

Efficient multi-keyword search over p2p web

Hanhua Chen, Hai Jin, Jiliang Wang, Lei Chen, Yunhao Liu, Lionel M. Ni
2008 Proceeding of the 17th international conference on World Wide Web - WWW '08  
We further argue that the intersection order between sets is important for multi-keyword search. Thus, we design optimal order strategies based on BF for both "and" and "or" queries.  ...  Other than single keyword search, multi-keyword search is quite popular and useful in many real applications.  ...  time it sees the tail.  ... 
doi:10.1145/1367497.1367631 dblp:conf/www/ChenJWCLN08 fatcat:v7vx7lx4yzdjlcjzmi55et3epi

Edge Hill Computing @ Interactive Social Book Search 2015

Daniel Campbell, Mark Michael Hall, David Walsh
2015 Conference and Labs of the Evaluation Forum  
We investigated what participants' first interactions with the collection are, how they interact with the multi-stage interface, and how users' interactions with the multi-stage interface change over the  ...  combinations, and also that all but one of the long tail search terms (over 4 words) are in the goal-oriented tasks.  ...  Multi-stage Interface: Book Bag Interaction Figure 5 add-to-bookbag distributions for both the goal-oriented and open tasks are very similar. However, there is a clear difference in the distribution of  ... 
dblp:conf/clef/CampbellHW15 fatcat:lc5tm2wosfcs5g5s3g5xya76ga

A tale of the tails: Power-laws in internet measurements

Aniket Mahanti, Niklas Carlsson, Anirban Mahanti, Martin Arlitt, Carey Williamson
2013 IEEE Network  
Two frequently occurring terms associated with these distributions, specifically heavy tails and long tails, are also discussed.  ...  First, we introduce power-laws and describe two commonly observed power-law distributions, namely the Pareto and Zipf distributions.  ...  A tail can be Pareto distributed (and heavy-tailed) even if the body of a distribution does not follow the power-law distribution.  ... 
doi:10.1109/mnet.2013.6423193 fatcat:qaneoc2h5fclvh73rkb64vlgwa

Request-Aware Scheduling for Busy Internet Services

J. Zhou, C. Zhang, T. Yang, L. Chu
2006 Proceedings IEEE INFOCOM 2006. 25TH IEEE International Conference on Computer Communications  
Internet traffic is bursty and network servers are often overloaded with surprising events or abnormal client request patterns.  ...  This paper studies scheduling algorithms for interactive network services that use multiple threads to handle incoming requests continuously and concurrently.  ...  This work was supported by Ask Jeeves and NSF grant CCF-0234346.  ... 
doi:10.1109/infocom.2006.279 dblp:conf/infocom/ZhouZYC06 fatcat:rsztmyn4rfeg5gdcpmkaet7uby

Efficient Sampling for Better OSN Data Provisioning [article]

Nick Duffield, Balachander Krishnamurthy
2016 arXiv   pre-print
and recording transactions such as user updates) and APIs of the OSN provider (such as the Twitter API).  ...  Data concerning the users and usage of Online Social Networks (OSNs) has become available externally, from public resources (e.g., user profiles), participation in OSNs (e.g., establishing relationships  ...  The Challenge of Heavy Tails The approach of this paper builds on experience and methods from sampling Internet traffic flow records. The distribution of bytes per flow is heavy-tailed [11].  ... 
arXiv:1612.04666v1 fatcat:5sarkvdukbaghebw7tuuyylwzy

The case for a wide-table approach to manage sparse relational data sets

Eric Chu, Jennifer Beckmann, Jeffrey Naughton
2007 Proceedings of the 2007 ACM SIGMOD international conference on Management of data - SIGMOD '07  
In particular, an RDBMS must 1) enable users to effectively build ad hoc queries over a very large number of attributes, and 2) support efficient evaluation of these queries over a wide, sparse table.  ...  We propose techniques that provide these capabilities, and argue that the single-table approach is a necessary component of self-managing database systems because it frees users from a tedious and potentially  ...  The last row and column show the percentage breakdowns of the Row-Num and Attr-Num distributions respectively. Table 1 reveals the heavy tails of the distributions.  ... 
doi:10.1145/1247480.1247571 dblp:conf/sigmod/ChuBN07 fatcat:cz34idr3dra2djuypm6mkhgarm

Has CEO Gender Bias Really Been Fixed? Adversarial Attacking and Improving Gender Fairness in Image Search

Yunhe Feng, Chirag Shah
Experiments on both simulated (three typical gender distributions) and real-world datasets demonstrate the proposed algorithms can mitigate gender bias effectively.  ...  , to re-rank returned images for given image queries.  ...  On the heavy-headed and heavy-tailed datasets, the 100 female items are distributed at the top 50% and the bottom 50% on the list respectively.  ... 
doi:10.1609/aaai.v36i11.21445 fatcat:t34b72rntvg3lodyh64cvcdiji

Query suggestion using hitting time

Qiaozhu Mei, Dengyong Zhou, Kenneth Church
2008 Proceeding of the 17th ACM conference on Information and knowledge management - CIKM '08  
The proposed algorithm and its variations can successfully boost long tail queries, accommodating personalized query suggestion, as well as finding related authors in research.  ...  Without involvement of twisted heuristics or heavy tuning of parameters, this method clearly captures the semantic consistency between the suggested query and the original query.  ...  efficient efficient pattern mining frequent pattern frequent pattern data mining multi dimensional data mining Query kNN-keywords Hitting Time Suggestions Personalized PgRank learning dirichlet  ... 
doi:10.1145/1458082.1458145 dblp:conf/cikm/MeiZC08 fatcat:fxzg6tpu35czxe4is5byo7r4ma

Selective early request termination for busy internet services

Jingyu Zhou, Tao Yang
2006 Proceedings of the 15th international conference on World Wide Web - WWW '06  
This paper presents the design and implementation of this scheme and describes experimental results to validate the proposed approach.  ...  Internet traffic is bursty and network servers are often overloaded with surprising events or abnormal client request patterns.  ...  This work was supported by IAC Search & Media (formerly Ask Jeeves) and NSF grant CCF-0234346.  ... 
doi:10.1145/1135777.1135866 dblp:conf/www/ZhouY06 fatcat:lssv3q7znvehbkj3yhinsuuedu

Diversity driven Query Rewriting in Search Advertising [article]

Akash Kumar Mohankumar, Nikit Begwani, Amit Singh
2021 arXiv   pre-print
For head and torso search queries, sponsored search engines use a huge repository of same intent queries and keywords, mined ahead of time.  ...  Online, this repository is used to rewrite the query and then lookup the rewrite in a repository of bid keywords contributing to significant revenue.  ...  For example, decile 1 represents queries that are most frequent in number, but the overall count of such queries is less (usually under 100), similarly owing to the heavy tail nature of Figure 4: Coverage  ... 
arXiv:2106.03816v1 fatcat:2oak2rrn5za4hdktscajmh7ioi

Long-tailed Extreme Multi-label Text Classification with Generated Pseudo Label Descriptions [article]

Ruohong Zhang, Yau-Shian Wang, Yiming Yang, Donghan Yu, Tom Vu, Likun Lei
2022 arXiv   pre-print
with the long tail of rare labels in highly skewed distributions.  ...  Extreme Multi-label Text Classification (XMTC) has been a tough challenge in machine learning research and applications due to the sheer sizes of the label spaces and the severe data scarce problem associated  ...  These numbers indicate that the label distributions are indeed highly skewed, with a heavy long tail in each corpus.  ... 
arXiv:2204.00958v1 fatcat:que7lt6h35al7mkn4zzhceesba