Relational ⋆-Liftings for Differential Privacy

Gilles Barthe, Thomas Espitau, Justin Hsu, Tetsuya Sato, Pierre-Yves Strub
2018 Logical Methods in Computer Science  
Recent developments in formal verification have identified approximate liftings (also known as approximate couplings) as a clean, compositional abstraction for proving differential privacy. This construction can be defined in two styles. Earlier definitions require the existence of one or more witness distributions, while a recent definition by Sato uses universal quantification over all sets of samples. These notions each have their own strengths: the universal version is more general than the existential ones, while existential liftings are known to satisfy more precise composition principles. We propose a novel, existential version of approximate lifting, called ⋆-lifting, and show that it is equivalent to Sato's construction for discrete probability measures. Our work unifies all known notions of approximate lifting, yielding cleaner properties, more general constructions, and more precise composition theorems for both styles of lifting, enabling richer proofs of differential privacy. We also clarify the relation between existing definitions of approximate lifting, and consider more general approximate liftings based on f-divergences.

The main strengths of differential privacy lie in its theoretical elegance, minimal assumptions, and flexibility. Recently, programming language researchers have developed approaches based on dynamic analysis, type systems, and program logics for formally proving differential privacy for programs. (We refer the interested reader to a recent survey (Barthe et al., 2016c) for an overview of this growing field.) In this paper, we consider approaches based on relational program logics (Barthe et al., 2016a; Olmedo, 2014; Sato, 2016). To capture the quantitative nature of differential privacy, these systems rely on a quantitative generalization of probabilistic couplings (see, e.g., Lindvall, 2002; Thorisson, 2000; Villani, 2008), called approximate liftings or (ε, δ)-liftings. Prior works have considered several potential definitions. While all definitions support compositional reasoning and enable program logics that can verify complex examples from the privacy literature, the various notions of approximate liftings have different strengths and weaknesses.
Broadly speaking, the first class of definitions requires the existence of one or two witness distributions that "couple" the output distributions from program executions on two related inputs (intuitively, the true database and the true database omitting one individual's record). The earliest definition (Barthe et al., 2013) supports accuracy-based reasoning for the Laplace mechanism, while subsequent definitions (Olmedo, 2014) support more precise composition principles from differential privacy and can be generalized to other notions of distance on distributions. These definitions, and their associated program logics, were designed for discrete distributions. In the course of extending these ideas to continuous distributions, Sato (2016) proposes a radically different notion of approximate lifting that does not rely on witness distributions. Instead, it uses a universal quantification over all sets of samples. Sato shows that this definition is strictly more general than the existential versions, but it is unclear (a) whether the gap can be closed and (b) whether his construction satisfies the same composition principles enjoyed by some existential definitions. As a consequence, no single definition is known to satisfy the properties needed to support all existing formalized proofs of differential privacy. Furthermore, some of the most involved privacy proofs cannot be formalized at all, as their proofs require a combination of constructions satisfied by existential or universal liftings, but not both.

Outline of the paper. After introducing mathematical preliminaries in Section 2, we introduce our main technical contribution: a new, existential definition of approximate lifting. This construction, which we call ⋆-lifting, is a generalization of an existing definition by Olmedo (2014). The key idea is to slightly enlarge the domain of witness distributions with a single generic point, broadening the class of approximate liftings.
By a maximum flow/minimum cut argument, we show that ⋆-liftings are equivalent to Sato's lifting over discrete distributions. This equivalence can be viewed as an approximate version of Strassen's theorem (Strassen, 1965), a classical result in probability theory characterizing the existence of probabilistic couplings. We present our definition and the proof of equivalence in Section 3. Then, we show that ⋆-liftings satisfy desirable theoretical properties by leveraging the equivalence of liftings in two ways. In one direction, Sato's definition gives simpler proofs of more general properties of ⋆-liftings. In the other direction, ⋆-liftings, like previously proposed existential liftings, can smoothly incorporate composition principles from the theory of differential privacy. In particular, our connection shows that Sato's definition can use these principles in the discrete case. We describe the key theoretical properties of ⋆-liftings in Section 4. Finally, we provide a thorough comparison of ⋆-lifting with other existing definitions of approximate lifting in Section 5, introduce a symmetric version of ⋆-lifting that satisfies the so-called advanced composition theorem from differential privacy (Dwork et al., 2010) in Section 6, and generalize ⋆-liftings to approximate liftings based on f-divergences in Section 7. Overall, the equivalence of ⋆-liftings and Sato's lifting, along with the natural theoretical properties satisfied by the common notion, suggests that these definitions are two views on the same concept: an approximate version of probabilistic coupling.

Background

To model probabilistic data, we work with discrete sub-distributions.

Definition 2.1. A sub-distribution over a set A is defined by its mass function µ : A → [0, 1], which gives the probability of the singleton events a ∈ A. This mass function must be such that the weight |µ| = ∑_{a ∈ A} µ(a) is well-defined and at most 1.
In particular, the support supp(µ) = {a ∈ A | µ(a) ≠ 0} must be discrete (i.e., finite or countably infinite). When the weight |µ| is equal to 1, we call µ a (proper) distribution. We let D(A) denote the set of sub-distributions over A. Events E are predicates on A; the probability of an event E(x) w.r.t. µ, written P_{x∼µ}[E(x)] or P_µ[E]
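As an informal illustration of Definition 2.1, the following minimal Python sketch represents a sub-distribution with finite support as a dictionary from elements to exact rational masses. The function names (`weight`, `support`, `is_sub_distribution`, `is_proper`, `prob`) are hypothetical and not taken from the paper, and the sketch assumes the standard reading of the probability of an event as the sum of the masses of the elements satisfying it.

```python
from fractions import Fraction

# Assumption: a sub-distribution with finite support is modelled as a dict
# mapping each element a to its mass mu(a); absent elements have mass 0.

def weight(mu):
    """Weight |mu|: the sum of mu(a) over all listed elements a."""
    return sum(mu.values(), Fraction(0))

def support(mu):
    """supp(mu): the elements carrying non-zero mass."""
    return {a for a, p in mu.items() if p != 0}

def is_sub_distribution(mu):
    """Every mass lies in [0, 1] and the total weight is at most 1."""
    return all(0 <= p <= 1 for p in mu.values()) and weight(mu) <= 1

def is_proper(mu):
    """A (proper) distribution is a sub-distribution of weight exactly 1."""
    return is_sub_distribution(mu) and weight(mu) == 1

def prob(mu, event):
    """Probability of an event (a predicate on elements), assumed here to be
    the sum of the masses of the elements satisfying the predicate."""
    return sum((p for a, p in mu.items() if event(a)), Fraction(0))

# A sub-distribution over {0, 1, 2} with total weight 3/4.
mu = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(0)}
```

With this `mu`, `weight(mu)` is 3/4, so `mu` is a sub-distribution but not a proper distribution, and `support(mu)` omits the zero-mass element 2. Exact fractions are used instead of floats so that the comparisons against 1 are not affected by rounding.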
doi:10.23638/lmcs-15(4:18)2019 fatcat:br7pgyikijce5obdwca6emnymq