A Comprehensive Comparison of Multiparty Secure Additions with Differential Privacy

Slawomir Goryczka, Li Xiong
2017 IEEE Transactions on Dependable and Secure Computing  
This paper considers the problem of secure data aggregation (mainly summation) in a distributed setting, while ensuring differential privacy of the result. We study secure multiparty addition protocols using well-known security schemes: Shamir's secret sharing, perturbation-based schemes, and various encryption schemes. We supplement our study with our new enhanced encryption scheme EFT, which is efficient and fault tolerant. Differential privacy of the final result is achieved by either the distributed Laplace or Geometric mechanism (DLPA or DGPA, respectively), while approximate differential privacy is achieved by diluted mechanisms. Distributed random noise is generated collectively by all participants, each of which draws random variables from one of several distributions: Gamma, Gaussian, Geometric, or their diluted versions. We introduce a new distributed privacy mechanism with noise drawn from the Laplace distribution, which achieves smaller redundant noise with high efficiency. We compare the complexity and security characteristics of the protocols under different differential privacy mechanisms and security schemes. More importantly, we implemented all the protocols and present an experimental comparison of their performance and scalability in a real distributed environment. Based on the evaluations, we identify our security scheme and the Laplace-based DLPA as the most efficient for secure distributed data aggregation with privacy.

Example Applications-We provide a few application scenarios to motivate our problem, including smart metering, syndromic surveillance, and intelligence data collection, among others. Smart meters measure and report electrical usage at the granularity of minutes, compared to traditional meters, which are read once a month. The collection of information on electricity demand for all households is useful in preventing blackouts, increasing the efficiency of the electrical grid, and reducing its impact on the environment. The same information, however, reveals a lot about individual consumers and their habits. An electricity usage profile of a single household may reveal information about the electric appliances used in the household as well as the number and routines of its inhabitants [14], [46]. These privacy concerns could be significantly reduced if such usage data were aggregated and anonymized before being reported to the utility company.
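The secure summation underlying this scenario can be illustrated with Shamir's secret sharing, one of the security schemes named in the abstract. The sketch below is a minimal, hypothetical illustration (not the paper's implementation): each house splits its meter reading into shares over a prime field, parties add the shares they hold pointwise, and the sum is reconstructed by Lagrange interpolation at zero, so no individual reading is ever revealed.

```python
import random

PRIME = 2**61 - 1  # field modulus (assumes values are far smaller)

def share(secret, t, n):
    """Split `secret` into n Shamir shares with reconstruction threshold t."""
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(t - 1)]
    # The share for party j is the random polynomial evaluated at x = j.
    return [(j, sum(c * pow(j, k, PRIME) for k, c in enumerate(coeffs)) % PRIME)
            for j in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 over GF(PRIME)."""
    total = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for m, (xm, _) in enumerate(shares):
            if m != j:
                num = num * (-xm) % PRIME
                den = den * (xj - xm) % PRIME
        # Fermat's little theorem gives the modular inverse of den.
        total = (total + yj * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return total

# Three houses with private meter readings; threshold 2.
readings = [17, 5, 42]
all_shares = [share(x, t=2, n=3) for x in readings]
# Each party j locally adds the shares it received (one from every party);
# Shamir sharing is additively homomorphic, so these are shares of the sum.
summed = [(j + 1, sum(all_shares[i][j][1] for i in range(3)) % PRIME)
          for j in range(3)]
print(reconstruct(summed[:2]))  # 64 — the total usage, without exposing any reading
```

Any two of the summed shares suffice to recover the total, while fewer than the threshold reveal nothing about individual readings.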
A group of neighboring houses that share the same infrastructure could securely sum up their electrical usage and apply privacy mechanisms to preserve the privacy of the individual consumers (Fig. 1).

Syndromic surveillance systems monitor the population in real time for early detection of large-scale disease outbreaks and bioterrorism attacks [10], [54]. In a simple scenario, a public health agency collects data from individual visitors who report their influenza symptoms (self surveillance). The number of detected cases can then be aggregated and anonymized based on the locations of individuals and their demographics. The collected data can be monitored and analyzed to detect seasonal epidemic outbreaks.

Intelligence data collection, in numerous situations, is performed in crowd settings both non-deliberately by the general public and by principals who are anonymously embedded in the crowds. A canonical example is an uprising in a major city under hostile governmental control: the general public uses smart devices to report various field data (third party surveillance [35]). In this case, the number of participants reporting data may change over time, and it is important to protect both the security of the participants (data contributors), as there is no trusted aggregator, and the privacy of the data subjects.

System Settings-We consider a dynamic set of data contributors that contribute their own data (self surveillance) or other data (third party surveillance) in a monitoring or surveillance system. In our running example, contributors D_i (1 ≤ i ≤ n) collect and contribute data x_i independently; the goal is to compute an aggregate function f(x_1, ..., x_n) of the data. In the self surveillance scenarios, the individuals represented in the collected data (data subjects) are also data contributors. We assume that the collected data are used by an untrusted application, or an application run by an untrusted party, for analysis and modeling (e.g., disease outbreak detection or intelligence analysis). Hence our security and privacy goals are to guarantee that: 1) no intermediate results beyond the final result are revealed to other parties during the distributed aggregation, and 2) the final result (with a random noise R) does not compromise the privacy of the individuals represented in the data (Fig. 2). Privacy of data subjects is defined by differential privacy, the state-of-the-art privacy notion [21], [22], [36] that assures a strong and provable privacy guarantee for aggregated data. It requires that the computation result change only negligibly when a single data record is added or removed.
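The distributed Laplace perturbation mentioned in the abstract exploits the infinite divisibility of the Laplace distribution: Laplace(0, b) equals the difference of two Exponential(b) variables, each of which is the sum of n i.i.d. Gamma(1/n, b) variables. The sketch below (parameter names are illustrative, not from the paper) shows each contributor adding a small Gamma-difference noise share locally, so that the aggregate noise in the secure sum is exactly Laplace, without any party knowing the total noise.

```python
import numpy as np

def local_noise(n_parties, scale, rng):
    """One party's noise share. Summed over n_parties independent parties,
    these Gamma(1/n, scale) differences yield exactly Laplace(0, scale)."""
    return rng.gamma(1.0 / n_parties, scale) - rng.gamma(1.0 / n_parties, scale)

rng = np.random.default_rng(0)
n, sensitivity, eps = 50, 1.0, 0.5
scale = sensitivity / eps           # Laplace scale for eps-differential privacy
values = rng.integers(0, 10, n)     # each party's private value (illustrative)

# Every party perturbs its own input before the secure summation;
# the noise in the aggregate is distributed as Laplace(0, scale).
perturbed = [v + local_noise(n, scale, rng) for v in values]
print(sum(perturbed))  # noisy total, close to values.sum()
```

Because the noise is generated collectively, no trusted aggregator is needed, which matches the system setting above where the application consuming the result is untrusted.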
doi:10.1109/tdsc.2015.2484326 pmid:28919841 pmcid:PMC5598559