Is There an Oblivious RAM Lower Bound?

Elette Boyle, Moni Naor
2016 Proceedings of the 2016 ACM Conference on Innovations in Theoretical Computer Science - ITCS '16  
An Oblivious RAM (ORAM), introduced by Goldreich and Ostrovsky (JACM 1996), is a (probabilistic) RAM that hides its access pattern, i.e., for every input the observed locations accessed are similarly distributed. Great progress has been made in recent years in minimizing the overhead of ORAM constructions, with the goal of obtaining the smallest overhead possible. We revisit the lower bound on the overhead required to obliviously simulate programs, due to Goldreich and Ostrovsky. While the lower bound is fairly general, including the offline case, when the simulator is given the reads and writes ahead of time, it does assume that the simulator behaves in a "balls and bins" fashion. That is, the simulator must act by shuffling data items around, and is not allowed to have sophisticated encoding of the data. We prove that for the offline case, showing a lower bound without the above restriction is related to the size of the circuits for sorting. Our proof is constructive, and uses a bit-slicing approach which manipulates the bit representations of data in the simulation. This implies that without obtaining yet unknown superlinear lower bounds on the size of such circuits, we cannot hope to get lower bounds on offline (unrestricted) ORAMs.

As noted in [GO96, WHC+14], the [GO96] lower bound is very powerful, applying also to the offline case (where all the accesses are given in advance), for arbitrary block sizes, for several relevant overhead metrics, and even when tolerating up to O(1) statistical failure probability. E.g., "This is almost optimal since the well-known logarithmic ORAM lower bound [GO96] is immediately applicable to the circuit size metric as well." [WCS14]. Altogether, the solidity of the Ω(log n) barrier would seem to be inescapable. Or is it?

Reexamining the [GO96] bound. As is well recognized, the Goldreich-Ostrovsky work [GO96] provided a seminal foundation for understanding ORAM and its restrictions. Upon closer observation, however, one begins to see that the lower bound of [GO96] is not the end of the story. Despite being broadly interpreted as a hard lower bound, applying to all scenarios, the [GO96] bound actually bears significant limitations.

"Balls and bins" storage. Perhaps most important, the [GO96] lower bound is within the restricted model of "balls and bins" data manipulation.
Namely, the n data items are modeled as "balls," CPU registers and server-side data storage locations are modeled as "bins," and the set of allowed data operations consists only of moving balls between bins. This is a meaningful model and captures the natural class of approaches that was the focus of [GO96] and many others. However, it immediately precludes any ORAM construction approach making use of data encoding, leveraging alternative representations of information, or any other form of non-black-box data manipulation. Such techniques have been shown to surpass the performance of analogous "black-box" approaches in several related tasks within computer science, such as improving overhead in distributed file sharing, and optimizing network throughput via network coding (e.g., [NR95, MS11]). It is not clear whether the Ω(log n) bound extends at all once these strong restrictions are lifted, and in light of our work this is not going to be simple to show.

Statistical security. The bound applies to ORAMs with statistical security, i.e., where the distributions of access patterns for any two different inputs are statistically close. This statistical relation is crucial for the proof approach to proceed. However, in many cases statistical guarantees may be stronger than necessary. Interestingly enough, the constructions presented within the same original ORAM paper [GO96] (and in fact all ORAM constructions for the following 15 years, until the works of Ajtai [Ajt10] and Damgård et al. [DMN11] in 2010) were not statistically secure. Rather, due to the use of pseudorandom functions and related tools, they guaranteed only that the distributions of memory accesses were computationally indistinguishable. Whether such constructions could bypass the Ω(log n) bound is unknown.
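
To make the notion of a hidden access pattern concrete, the following is a minimal sketch of the folklore linear-scan simulation, which is not a construction from this paper. It touches every memory cell on every access, so the observed sequence of locations is identical for all inputs of the same length, at the cost of overhead n rather than the logarithmic overhead discussed above. It only reads cells and writes them back, so it stays within the "balls and bins" model. The names `LinearScanORAM` and `observed_trace` are hypothetical, and a real deployment would also re-encrypt every cell on write-back, which this sketch omits.

```python
from typing import List, Optional, Tuple

Access = Tuple[str, int, Optional[int]]  # (op, address, value or None)


class LinearScanORAM:
    """Illustrative only: scans all n cells on every logical access."""

    def __init__(self, n: int) -> None:
        self.memory: List[int] = [0] * n
        # What an observer of the memory bus sees: (operation, location).
        self.trace: List[Tuple[str, int]] = []

    def access(self, op: str, addr: int, value: Optional[int] = None) -> int:
        result = 0
        for i in range(len(self.memory)):
            # Every cell is read and written back, whether or not it is
            # the cell the program actually asked for.
            self.trace.append(("read", i))
            current = self.memory[i]
            if i == addr:  # decided inside the CPU, not visible in the trace
                result = current
                if op == "write" and value is not None:
                    current = value
            self.trace.append(("write", i))
            # Assumption: a real scheme would re-encrypt here so that an
            # observer cannot tell whether the contents changed.
            self.memory[i] = current
        return result


def observed_trace(n: int, program: List[Access]) -> List[Tuple[str, int]]:
    """Run a sequence of logical accesses and return the visible trace."""
    oram = LinearScanORAM(n)
    for op, addr, value in program:
        oram.access(op, addr, value)
    return oram.trace
```

Two programs of equal length, such as `[("write", 0, 7), ("read", 0, None)]` and `[("read", 3, None), ("write", 2, 5)]`, yield the same trace under `observed_trace`, which is the property the definition asks for, here in its strongest (perfect) form.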
doi:10.1145/2840728.2840761 dblp:conf/innovations/BoyleN16 fatcat:2tsbwudqpjfdxf5xnkcmqvkpfi