BLENDER: Enabling Local Search with a Hybrid Differential Privacy Model [article]

Brendan Avent, Aleksandra Korolova, David Zeber, Torgeir Hovden, Benjamin Livshits
2018 arXiv   pre-print
This blended approach provides significant improvements in the utility of obtained data compared to related work while providing users with their desired privacy guarantees.  ...  We demonstrate that within this model, it is possible to design a new type of blended algorithm for the task of privately computing the head of a search log.  ...  Body 1 : 1 let N (r, D) = number of times an arbitrary record r appears in the given dataset D. 2: for each user i ∈ T do 3: . 14: let HL map the M queries with the highest estimated marginal probabilities  ... 
arXiv:1705.00831v3 fatcat:2xqkvq6lsvhstjzw6gfer3m2iy

BLENDER: Enabling Local Search with a Hybrid Differential Privacy Model

Brendan Avent, Aleksandra Korolova, David Zeber, Torgeir Hovden, Benjamin Livshits
2019 Journal of Privacy and Confidentiality  
We demonstrate that within this model, it is possible to design a new type of blended algorithm that improves the utility of obtained data, while providing users with their desired privacy guarantees.  ...  We propose a hybrid model of differential privacy that considers a combination of regular and opt-in users who desire the differential privacy guarantees of the local privacy model and the trusted curator  ...  Body 1 : 1 let N (r, D) = number of times an arbitrary record r appears in the given dataset D. 2: for each user i ∈ T do 3: . 14: let HL map the M queries with the highest estimated marginal probabilities  ... 
doi:10.29012/jpc.680 fatcat:nyh3uctasjfm5c4hxoewyhdhma

Optimization of signature file parameters for databases with varying record lengths

S Kocberber
1999 Computer journal  
In signature file processing, accurate estimation of the number of false drops is essential to obtain a more accurate signature file and therefore to obtain a better (query) response time.  ...  With a formal proof we show that under certain conditions the number of false drops estimated by considering the average record length is less than or equal to the precise 'expected' estimation which is  ...  ACKNOWLEDGEMENTS We greatly appreciate the constructive criticism provided by the anonymous referees. Their suggestions have greatly improved both the contents and presentation of the paper.  ... 
doi:10.1093/comjnl/42.1.11 fatcat:nkjf4v7tpzc3tba6edm2jairgi

Selectivity Estimation on Set Containment Search

Yang Yang, Wenjie Zhang, Ying Zhang, Xuemin Lin, Liping Wang
2019 Data Science and Engineering  
Given a query record Q and a record dataset S , we aim to accurately and efficiently estimate the selectivity of set containment search of query Q over S .  ...  In this paper, we study the problem of selectivity estimation on set containment search.  ...  Given a query record Q and a dataset S , we aim to accurately and efficiently estimate the selectivity of the set containment search of Q on S.  ... 
doi:10.1007/s41019-019-00104-1 fatcat:ogbv2g77ozhujaocm5rgmxf5zi

A security machanism for statistical database

Leland L. Beck
1980 ACM Transactions on Database Systems  
It is shown that the number of queries required to compromise the database can be made arbitrarily large by accepting moderate increases in the variance of responses to queries.  ...  It is assumed that the database allows "total," "average," "count," and "percentile" queries; a query may refer to any arbitrary subset of the database.  ...  Schmeiser for many long and fruitful discussions concerning the subject of this paper, and for his suggestion of the development of a general model of user inference.  ... 
doi:10.1145/320613.320617 fatcat:hryl6p44zzbuzhh3xu5pxauzc4

Experimental evaluation of selectivity estimation on big spatial data

Harry Chasparis, Ahmed Eldawy
2017 Proceedings of the Fourth International ACM Workshop on Managing and Mining Enriched Geo-Spatial Data - GeoRich '17  
One of the fundamental spatial queries is the selectivity estimation problem where users want to quickly estimate the total number of records in a given query range.  ...  deciding when to use each of these techniques based on the application requirements.  ...  Problem Definition: Selectivity Estimation. The selectivity estimation problem is to calculate the number of points in a given query range.  ... 
doi:10.1145/3080546.3080553 dblp:conf/sigmod/ChasparisE17 fatcat:2c3lqr5ihfeslmdwfik2woipui

Partial evaluation of queries for bit-sliced signature files

Seyit Kocberber, Fazli Can
1996 Information Processing Letters  
Under the sequentiality assumption of disk blocks, P-BSSF provides a desirable response time of 1 second for a database size of one million records with a 28% space overhead.  ...  The analysis shows that, with 14% increase in space overhead, P-BSSF provides a query processing time improvement of more than 85% for multi-term query environments with respect to the best performance  ...  We repeat the formulas to compute the number of on-bits in the query signature (query weight) and the expected number of false drops given in [8].  ... 
doi:10.1016/s0020-0190(96)00176-7 fatcat:r7joss3rwvaqlabawbs5dsfehy

DPCube: Differentially Private Histogram Release through Multidimensional Partitioning [article]

Yonghui Xiao, Li Xiong, Liyue Fan, Slawomir Goryczka
2012 arXiv   pre-print
Finally, we implement and experimentally evaluate several applications using the released histograms, including counting queries, classification, and blocking for record linkage and show the benefit of  ...  We formally analyze the utility of the released histograms and quantify the errors for answering linear queries such as counting queries.  ...  Figure 15 (a) shows the average absolute query error with respect to varying threshold values.  ... 
arXiv:1202.5358v1 fatcat:yrlnlexdhrge3nlkpagnybjsd4

Diverse Mobile System for Location-Based Mobile Data

Qing Liao, Haoyu Tan, Wuman Luo, Ye Ding
2018 Wireless Communications and Mobile Computing  
In this work, we design a storage system based on diverse replica scheme which not only can improve the query processing efficiency but also can reduce the cost of storage space.  ...  Moreover, we propose an effective approach to select an appropriate set of diverse replicas, which is optimized for the expected query loads while conforming to the given storage space budget.  ...  We estimate the cost of a query with respect to a replica via the expectation of the running time towards the replica.  ... 
doi:10.1155/2018/4217432 fatcat:y4xqigzeibh4hkih7wdjrth7wy

Modeling and Predicting DNS Server Load [article]

Zheng Wang
2016 arXiv   pre-print
To gain the desired balance, TTL adjustment depends on predictions of query loads under alternative TTLs.  ...  This paper proposes a model of DNS server load, which employs the uniform aggregate caching model to simplify the complexity of modeling clients' requests and their caching.  ...  Given one TTL of the requested record τ 0 and the respective query load of authoritative servers B(τ 0 ), we can derive the equivalent aggregate request rate arriving at each caching resolver as A = B(  ... 
arXiv:1606.09530v1 fatcat:6jhrst2gkrbxrbwiungsu4bnae

Database selection for processing k nearest neighbors queries in distributed environments

Clement Yu, Prasoon Sharma, Weiyi Meng, Yan Qin
2001 Proceedings of the first ACM/IEEE-CS joint conference on Digital libraries - JCDL '01  
Histograms are constructed and algorithms are given to provide estimates of the desirabilities of the databases with respect to the given query.  ...  The paper concentrates on the processing of the structured component of a distributed query.  ...  Based on the histogram on each such attribute, an estimate is made on the desirability of each database with respect to the query. The databases are then ranked with respect to their desirability.  ... 
doi:10.1145/379437.379504 dblp:conf/jcdl/YuSMQ01 fatcat:jjuq6tyn3rhqleownm7akrbuwa

Defining and applying a method for improving the sensitivity and specificity of an emergency department early event detection system

Matthew J Scholer, George S Ghneim, Shiying Wu, Matt Westlake, Debbie A Travers, Anna E Waller, Anne-Lyne McCalla, Scott F Wetterhall
2007 AMIA Annual Symposium Proceedings  
Utilizing a stratified sampling method and expert review to create a gold standard dataset for the calculation of sensitivity and specificity, we describe how varying syndrome structure impacts these statistical  ...  The ability to calculate these values aids system designers in the refinement of syndrome definitions to better meet public health needs.  ...  Acknowledgements The authors would like to thank Jennifer MacFarquhar, Dennis Falls, Aaron Kipp and John Crouch for their contributions to this project.  ... 
pmid:18693917 pmcid:PMC2655810 fatcat:qovj5frlrranteuvmpeugx3kpy

Workload-Driven Antijoin Cardinality Estimation

Florin Rusu, Zixuan Zhuang, Mingxi Wu, Chris Jermaine
2015 ACM Transactions on Database Systems  
Given the widespread use of antijoin and subset-based queries in analytical workloads and the extensive research targeted at join cardinality estimation (a seemingly related problem), the lack of adequate  ...  Second, we design a Bayesian statistics framework that updates the superpopulation model according to the live queries, thus allowing the estimator to adapt dynamically to the online workload.  ...  In particular, the proof of Theorem 6.1 is largely provided by one of the anonymous reviewers.  ... 
doi:10.1145/2818178 fatcat:kk3zxh27njaafnnabvakupumlm

ECO-DNS: Expected Consistency Optimization for DNS

Chen Chen, Stephanos Matsumoto, Adrian Perrig
2015 2015 IEEE 35th International Conference on Distributed Computing Systems  
The flexibility of the current Domain Name System (DNS) has been stretched to its limits to accommodate new applications such as content delivery networks and dynamic DNS.  ...  In particular, maintaining cache consistency has become a much larger problem, as emerging technologies require increasingly-frequent updates to DNS records.  ...  We compare two designs for parameter estimation: a) counting the number of queries within a fixed-length time window, and b) calculating the duration given a fixed number of queries.  ... 
doi:10.1109/icdcs.2015.34 dblp:conf/icdcs/ChenMP15 fatcat:bnabfix2izdv7pjprfkv7ikwma

An Improved Approach for Estimating Social POI Boundaries With Textual Attributes on Social Media [article]

Cong Tran, Dung D. Vu, Won-Yong Shin
2020 arXiv   pre-print
The computational complexity of the proposed I-SoBEst algorithm is shown to scale linearly with the number of records.  ...  Thus, using SoBEst in such cases may possibly result in unsatisfactory performance on the boundary estimation quality (BEQ), which is expressed as a function of the F-measure.  ...  A two-phase algorithm for estimating a POI boundary with linear scaling complexity in the number of input records was proposed in [46].  ... 
arXiv:2012.09990v1 fatcat:67sl2pnff5g5bfekv36ctlzkma