Benchmarking SciDB data import on HPC systems

Siddharth Samsi, Laura Brattain, William Arcand, David Bestor, Bill Bergeron, Chansup Byun, Vijay Gadepally, Matthew Hubbell, Michael Jones, Anna Klein, Peter Michaleas, Lauren Milechin (+6 others)
<span title="">2016</span> <i title="IEEE"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/v67j4hwzdjcu3csnvox3vnxiry" style="color: black;">2016 IEEE High Performance Extreme Computing Conference (HPEC)</a> </i> &nbsp;
SciDB is a scalable, computational database management system that uses an array model for data storage. This array data model makes SciDB well suited to storing and managing large amounts of imaging data. SciDB is designed to support advanced analytics inside the database, reducing the need to extract data for analysis. It is designed to be massively parallel and can run on commodity hardware in a high performance computing (HPC) environment. In this paper, we present the performance of SciDB using simulated image data. The Dynamic Distributed Dimensional Data Model (D4M) software is used to implement the benchmark on a cluster running the MIT SuperCloud software stack. A peak performance of 2.2M database inserts per second was achieved on a single node of this system. We also show that SciDB and the D4M toolbox provide more efficient ways to access random sub-volumes of massive datasets than the traditional approach of reading volumetric data from individual files. This work describes the D4M and SciDB tools we developed and presents the initial performance results. The peak insert rate was achieved by using parallel inserts, in-database merging of arrays, and supercomputing techniques such as distributed arrays and single-program-multiple-data programming.
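The ingest pattern named in the abstract (single-program-multiple-data workers inserting in parallel, followed by a merge into one array, then sub-volume reads) can be illustrated with a minimal sketch. This is not the authors' benchmark and it does not use the SciDB or D4M APIs: a plain Python dictionary keyed by `(z, y, x)` stands in for the database array, threads stand in for SPMD processes, and every function name below (`block_range`, `simulated_slice`, `ingest_partition`, `parallel_ingest`, `read_subvolume`) is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def block_range(n_items, n_workers, rank):
    """Half-open [start, stop) range owned by `rank` under a block distribution."""
    base, extra = divmod(n_items, n_workers)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def simulated_slice(z, height, width):
    """One synthetic image slice as {(z, y, x): value} cells (illustrative data only)."""
    return {(z, y, x): (z * 31 + y * 17 + x) % 256
            for y in range(height) for x in range(width)}


def ingest_partition(rank, n_workers, depth, height, width):
    """Every worker runs this same function; only `rank` differs (SPMD style)."""
    staging = {}  # stand-in for a per-worker staging array
    start, stop = block_range(depth, n_workers, rank)
    for z in range(start, stop):
        staging.update(simulated_slice(z, height, width))
    return staging


def parallel_ingest(depth, height, width, n_workers):
    """Insert slices in parallel, then merge the staging arrays into one target."""
    target = {}  # stand-in for the final database array
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        stagings = pool.map(
            lambda rank: ingest_partition(rank, n_workers, depth, height, width),
            range(n_workers),
        )
        for staging in stagings:
            target.update(staging)  # merge step, done serially in this sketch
    return target


def read_subvolume(array, z_range, y_range, x_range):
    """Fetch a sub-volume by coordinate lookup instead of opening per-slice files."""
    return {(z, y, x): array[(z, y, x)]
            for z in range(*z_range)
            for y in range(*y_range)
            for x in range(*x_range)}


if __name__ == "__main__":
    volume = parallel_ingest(depth=10, height=4, width=4, n_workers=3)
    sub = read_subvolume(volume, (2, 5), (1, 3), (0, 2))
    print(len(volume), len(sub))
```

In this sketch the slice partitions are disjoint, so the merge never has to resolve conflicting cells; a real ingest would also need to choose chunk sizes and decide where the merge runs, which the abstract does not specify.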
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/hpec.2016.7761617">doi:10.1109/hpec.2016.7761617</a> <a target="_blank" rel="external noopener" href="https://dblp.org/rec/conf/hpec/SamsiBABBBGHJKM16.html">dblp:conf/hpec/SamsiBABBBGHJKM16</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/swikmonoura5tkgwowidvutaqe">fatcat:swikmonoura5tkgwowidvutaqe</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200825124256/https://arxiv.org/pdf/1609.07545v1.pdf" title="fulltext PDF download [not primary version]" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <span style="color: #f43e3e;">&#10033;</span> <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/65/3d/653d5e7012a65d961c6e22fae48332095db3f315.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/hpec.2016.7761617"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> ieee.com </button> </a>