Private and Efficient Query Processing on Outsourced Genomic Databases

Reza Ghasemi, Md. Momin Al Aziz, Noman Mohammed, Massoud Hadian Dehkordi, Xiaoqian Jiang
2017 IEEE Journal of Biomedical and Health Informatics
Applications of genomic studies are spreading rapidly in many domains of science and technology, such as healthcare, biomedical research, direct-to-consumer services, and legal and forensic work. However, a number of obstacles make it hard to access and process a big genomic database for these applications. First, genome sequencing is a time-consuming and expensive process. Second, processing genomic sequences requires large-scale computation and storage systems. Third, genomic databases are often owned by different organizations and thus not available for public usage. The cloud computing paradigm can be leveraged to facilitate the creation and sharing of big genomic databases for these applications. Genomic data owners can outsource their databases to a centralized cloud server to ease access to them. However, data owners are reluctant to adopt this model, as it requires outsourcing the data to an untrusted cloud service provider, which may lead to data breaches. In this paper, we propose a privacy-preserving model for outsourcing genomic data to a cloud. The proposed model enables query processing while providing privacy protection of genomic databases. Privacy of the individuals is guaranteed by permuting the database and adding fake genomic records to it. These techniques allow the cloud to evaluate count and top-k queries securely and efficiently. Experimental results demonstrate that a count query and a top-k query over 40 SNPs in a database of 20,000 records take around 100 and 150 seconds, respectively.
Index Terms: Genomic data private computation; privacy in cloud computing; homomorphic encryption; privacy in genomic data storage
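To make the idea of a count query over a permuted database padded with fake records concrete, the following is a minimal plaintext sketch in Python. It is not the paper's protocol: the abstract indicates that the actual scheme relies on homomorphic encryption, whereas this sketch works on unencrypted data. The function names, the SNP identifiers, the genotype alphabet, and the use of an `is_real` flag to exclude fake records from the count are all assumptions made for illustration only.

```python
import random

def outsource(records, num_fake, snp_ids, genotypes=("AA", "AG", "GG"), seed=0):
    """Pad the database with random fake records, then permute it.

    Hypothetical helper: each entry is (record, is_real). In a real scheme
    the flag and the genotypes would be encrypted before leaving the owner.
    """
    rng = random.Random(seed)
    tagged = [(dict(r), True) for r in records]
    for _ in range(num_fake):
        fake = {s: rng.choice(genotypes) for s in snp_ids}
        tagged.append((fake, False))
    rng.shuffle(tagged)  # permutation hides the original record order
    return tagged

def count_query(tagged, predicate):
    """Count real records whose genotypes match every SNP in the predicate."""
    return sum(
        1
        for rec, is_real in tagged
        if is_real and all(rec.get(s) == g for s, g in predicate.items())
    )

# Toy data: three individuals, two SNPs (identifiers are made up).
records = [
    {"rs1": "AA", "rs2": "AG"},
    {"rs1": "AA", "rs2": "GG"},
    {"rs1": "AG", "rs2": "AG"},
]
db = outsource(records, num_fake=5, snp_ids=["rs1", "rs2"])
result = count_query(db, {"rs1": "AA"})
```

In this sketch the count is unaffected by the fake records because they are filtered out by the flag; how the published scheme achieves the same effect over encrypted data is described in the paper itself.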
doi:10.1109/jbhi.2016.2625299 pmid:27834660 pmcid:PMC5498255 fatcat:gbp3hki5kzh2hlnvsyqoziyb2a