Arbitrary Boolean logical search operations on massive molecular file systems [article]

James L. Banal, Tyson R. Shepherd, Joseph D. Berleant, Hellen Huang, Miguel Reyes, Cheri M. Ackerman, Paul Blainey, Mark Bathe
2020 bioRxiv pre-print
DNA is an ultra-high-density storage medium that could meet exponentially growing worldwide data storage demand. However, accessing arbitrary data subsets within exabyte-scale DNA data pools is limited by the finite addressing space for individual DNA-based blocks of data. Here, we form files by encapsulating data-encoding DNA within silica capsules that are surface-labeled with multiple unique barcodes. Barcoding is performed with single-stranded DNA representing file metadata that enables Boolean logic selection on the entire pool of data. We demonstrate encapsulation and Boolean selection of sub-pools of image files using fluorescence-activated sorting, with selection sensitivity of 1 in 10^6 files per channel. Our strategy in principle enables retrieval of targeted data subsets from exabyte- and larger-scale data pools, thereby offering a random access file system for massive molecular data sets.
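As an illustration only, the idea of Boolean selection over barcoded files can be sketched in software by treating each capsule as a record carrying a set of metadata labels and treating a query as a predicate on that set. This is not the authors' wet-lab method: the class name Capsule, the helper functions, the file names, and the labels below are all hypothetical, and fluorescence-activated sorting is reduced here to a simple set-membership test.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List


@dataclass(frozen=True)
class Capsule:
    """Hypothetical stand-in for one silica capsule holding one file."""
    name: str
    barcodes: FrozenSet[str]  # metadata labels attached to the capsule surface


# A query maps a capsule's barcode set to True (keep) or False (discard).
Query = Callable[[FrozenSet[str]], bool]


def has(label: str) -> Query:
    """Query that is true when the capsule carries the given barcode."""
    return lambda barcodes: label in barcodes


def and_(a: Query, b: Query) -> Query:
    return lambda barcodes: a(barcodes) and b(barcodes)


def or_(a: Query, b: Query) -> Query:
    return lambda barcodes: a(barcodes) or b(barcodes)


def not_(a: Query) -> Query:
    return lambda barcodes: not a(barcodes)


def select(pool: List[Capsule], query: Query) -> List[str]:
    """Return the names of all capsules in the pool that satisfy the query."""
    return [c.name for c in pool if query(c.barcodes)]


# Made-up pool of four files with made-up metadata labels.
pool = [
    Capsule("file_a", frozenset({"cat", "wild"})),
    Capsule("file_b", frozenset({"cat", "domestic"})),
    Capsule("file_c", frozenset({"dog", "domestic"})),
    Capsule("file_d", frozenset({"plane"})),
]

cat_not_wild = and_(has("cat"), not_(has("wild")))
domestic_or_plane = or_(has("domestic"), has("plane"))

print(select(pool, cat_not_wild))       # ['file_b']
print(select(pool, domestic_or_plane))  # ['file_b', 'file_c', 'file_d']
```

Because every query is evaluated against the whole pool, AND, OR, and NOT compose freely without a per-file address, which is the property the abstract attributes to metadata barcoding.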
doi:10.1101/2020.02.05.936369 fatcat:tdtpnqgtdja6jk7a3hf6e6646m