Multi-Key Searchable Encryption, Revisited [chapter]

Ariel Hamlin, Abhi Shelat, Mor Weiss, Daniel Wichs
2018 Lecture Notes in Computer Science  
We consider a setting where users store their encrypted documents on a remote server and can selectively share documents with each other. A user should be able to perform keyword searches over all the documents she has access to, including the ones that others shared with her. The contents of the documents, and the search queries, should remain private from the server. This setting was considered by Popa et al. (NSDI '14) who developed a new cryptographic primitive called Multi-Key Searchable Encryption (MKSE), together with an instantiation and an implementation within a system called Mylar, to address this goal. Unfortunately, Grubbs et al. (CCS '16) showed that the proposed MKSE definition fails to provide basic security guarantees, and that the Mylar system is susceptible to simple attacks. Most notably, if a malicious Alice colludes with the server and shares a document with an honest Bob then the privacy of all of Bob's search queries is lost. In this work we revisit the notion of MKSE and propose a new strengthened definition that rules out the above attacks. We then construct MKSE schemes meeting our definition. We first give a simple and efficient construction using only pseudorandom functions. This construction achieves our strong security definition at the cost of increasing the server storage overhead relative to Mylar, essentially replicating the document each time it is shared. We also show that high server storage overhead is not inherent, by giving an alternate (albeit impractical) construction that manages to avoid it using obfuscation.

Overview of Our MKSE Definition. We consider users that can take on two types of roles: data owners and queriers. Data owners have a document they wish to share with some subset of the users. Each document has its own associated data key K_d, where the data owner "encrypts" the document using this key, and uploads the encrypted document to the server. Each user has a query key K_u that it uses to issue search queries. When a data owner shares a document d with a user u they create a share key ∆_{u,d} which depends on the keys K_u, K_d, as well as the encrypted document, and store ∆_{u,d} on the server. When a querier wants to search for some keyword, he "encrypts" the keyword using his query key K_u, and sends the resulting encrypted query to the server.
For each document d that was shared with the user u, the server uses the share key ∆_{u,d} to execute the encrypted query over the encrypted document, and learns if the keyword is contained in that document. This allows the server to return all relevant documents the querier has access to and which contain the keyword.

The main syntactic difference between our notion, and the MKSE notion used in Mylar, is in how the share key ∆_{u,d} is generated. As noted above, the share key in Mylar depends only on the keys K_u, K_d, whereas in our notion it also depends on the encrypted document. By tying the share key to the document, we can ensure that each query can only be executed on the specific documents that were shared with the querier, rather than on arbitrary documents, even if the server has the key K_d.

To define security, we consider a share graph between data owners (documents) and queriers, representing who shares data with whom, where some subset of data owners are malicious and collude with the server. The desired security guarantee is that the server learns nothing about the contents of the documents belonging to the honest data owners, or the keywords being queried, beyond the access pattern of which documents are returned by each query (i.e., out of the documents shared with the querier, which ones contain the queried keyword). We provide an indistinguishability-based definition where the adversary chooses the documents and data keys belonging to the malicious data owners, and two potential values (a "left" and a "right" value) for each query and each document belonging to an honest data owner. The left and right values must lead to the same access pattern, and repeated queries must appear in the same locations in both query sequences. The adversary then gets all encrypted documents, share keys, and encrypted queries, and should not be able to distinguish whether these were created using the left or right values.
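The query, share, and search flow described above can be illustrated with a small Python sketch. This is a toy illustration only and should not be read as the authors' construction. It assumes HMAC-SHA256 as a stand-in pseudorandom function, omits document encryption under K_d entirely, and builds the share key from K_u and the document's plaintext keyword set. The function names (`prf`, `keygen`, `encrypt_query`, `make_share_key`, `server_search`) and the per-share nonce are hypothetical choices made for this example.

```python
import hashlib
import hmac
import secrets


def prf(key: bytes, msg: bytes) -> bytes:
    """HMAC-SHA256, used here as a stand-in pseudorandom function (assumption)."""
    return hmac.new(key, msg, hashlib.sha256).digest()


def keygen() -> bytes:
    """Sample a fresh 256-bit key, usable as a query key K_u."""
    return secrets.token_bytes(32)


def encrypt_query(k_u: bytes, keyword: str) -> bytes:
    """Querier side: the toy encrypted query is PRF(K_u, keyword)."""
    return prf(k_u, keyword.encode("utf-8"))


def make_share_key(k_u: bytes, keywords):
    """Build a toy share key for one (user, document) pair.

    The share key is a fresh nonce plus one tag PRF(PRF(K_u, w), nonce) per
    keyword w, so it depends on both the user's key and this document's
    contents. Document encryption under K_d is omitted (assumption).
    """
    nonce = secrets.token_bytes(16)
    tags = frozenset(prf(encrypt_query(k_u, w), nonce) for w in keywords)
    return nonce, tags


def server_search(query: bytes, shares) -> list:
    """Server side: test the query against every share key of the querier.

    `shares` maps a document id to its (nonce, tags) share key. The returned
    list of matching ids is exactly the access pattern of this query.
    """
    hits = []
    for doc_id, (nonce, tags) in shares.items():
        if prf(query, nonce) in tags:
            hits.append(doc_id)
    return sorted(hits)


if __name__ == "__main__":
    k_u = keygen()
    # Two documents shared with user u; each share gets its own share key.
    shares_u = {
        "d1": make_share_key(k_u, {"crypto", "search"}),
        "d2": make_share_key(k_u, {"search"}),
    }
    # The server sees only the encrypted query and the share keys.
    print(server_search(encrypt_query(k_u, "search"), shares_u))
```

In this sketch the server stores one tag per keyword for every share, which loosely mirrors the storage cost the abstract attributes to the PRF-based construction (the document is effectively replicated each time it is shared). Because each tag is bound to a per-share nonce and to K_u, a query encrypted under a different user's key matches nothing in these share keys.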
Since the adversary only learns the access pattern of which documents are returned by each query, the above definition captures the minimal leakage for schemes that reveal the access pattern, which seems to be the case in all practical schemes. This is a significant qualitative improvement over the leakage allowed by the previous definition of [PZ13] and the corresponding schemes. Most importantly, when a malicious user Mallory is colluding with the server and shares some data with Bob, the previous schemes completely leaked the contents of Bob's query, whereas our definition still only reveals the access pattern. We note that, as in single-key SSE, leaking the access pattern does reveal some potentially sensitive information, and in some scenarios (e.g., when combined with auxiliary information about the documents) this may allow a sufficiently powerful attacker to completely recover the query, as shown in the single-key SSE setting by the recent works [CGPR15,
doi:10.1007/978-3-319-76578-5_4 fatcat:vs72r43yf5hxnmnsbnjscuh3t4