Separation Logic-Assisted Code Transformations for Efficient High-Level Synthesis

Felix Winterstein, Samuel Bayliss, George A. Constantinides
<span title="">2014</span> <i title="IEEE"> <a target="_blank" rel="noopener" href="" style="color: black;">2014 IEEE 22nd Annual International Symposium on Field-Programmable Custom Computing Machines</a> </i> &nbsp;
The capabilities of modern FPGAs permit the mapping of increasingly complex applications into reconfigurable hardware. High-level synthesis (HLS) promises a significant shortening of the FPGA design cycle by raising the abstraction level of the design entry to high-level languages such as C/C++. Applications using dynamic, pointer-based data structures and dynamic memory allocation, however, remain difficult to implement well, yet such constructs are widely used in software. Automated
more &raquo; ... ons that aim to leverage the increased memory bandwidth of FPGAs by distributing the application data over separate banks of on-chip memory are often ineffective in the presence of dynamic data structures, due to the lack of an automated analysis of pointer-based memory accesses. In this work, we take a step towards closing this gap. We present a static analysis for pointer-manipulating programs which automatically splits heap-allocated data structures into disjoint, independent regions. The analysis leverages recent advances in separation logic, a theoretical framework for reasoning about heap-allocated data which has been successfully applied in recent software verification tools. Our algorithm focuses on dynamic data structures accessed in loops and is accompanied by automated source-to-source transformations which enable automatic loop parallelization and memory partitioning by off-the-shelf HLS tools. We demonstrate the successful loop parallelization and memory partitioning by our tool flow using three real-life applications which build, traverse, update and dispose dynamically allocated data structures. Our case studies, comparing the automatically parallelized to the non-parallelized HLS implementations, show an average latency reduction by a factor of 2.5 across our benchmarks.
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="">doi:10.1109/fccm.2014.11</a> <a target="_blank" rel="external noopener" href="">dblp:conf/fccm/WintersteinBC14</a> <a target="_blank" rel="external noopener" href="">fatcat:nvaonfnp6vhnlhoi4phn7lfqzm</a> </span>
<a target="_blank" rel="noopener" href="" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href=""> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> </button> </a>