A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2020; the original URL is https://link.springer.com/content/pdf/10.1007%2F978-3-030-18645-6_4.pdf (application/pdf).
PHINEAS: An Embedded Heterogeneous Parallel Platform
[chapter]
2019
Lecture Notes in Computer Science (Springer International Publishing): https://fatcat.wiki/container/2w3awgokqne6te4nvlofavy5a4
With machine learning being applied to increasingly varied domains, the computational needs of researchers have increased proportionately. Hobbyists, researchers and universities are turning to building their own cluster computers to meet their high performance compute needs. These clusters are typically highly efficient, low cost ARM based platforms consisting of between 4 and 8 nodes. In this paper, we present PHINEAS: Parallel Heterogeneous INdigenous Embedded ARM System, a parallel compute platform which allows for distributed computation using MPI and OpenMP and which further leverages the on-board GPU to perform general purpose compute tasks. We describe the hardware components of the cluster, the software stack installed on each node and a host of common benchmark algorithms and their results. The results show that the cluster meets the stringent latency requirements of embedded systems. We further describe how the on-board GPU's OpenGL ES 2.0 programming model can be used to implement tasks such as image convolution and neural network inference which are common in intelligent embedded systems. Parallelisation of compute tasks across multiple GPUs is discussed as a method to combine the advantages of distributed and heterogeneous computing.

doi:10.1007/978-3-030-18645-6_4 (https://doi.org/10.1007/978-3-030-18645-6_4)
fatcat:hpid3pse2veaxbrr2rez7dxk4q (https://fatcat.wiki/release/hpid3pse2veaxbrr2rez7dxk4q)
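The abstract's combination of MPI across cluster nodes and OpenMP within each node can be illustrated with a minimal hybrid sketch. This example is not taken from the paper; the problem size, the reduction being computed and the rank/thread layout are assumptions chosen only to show the programming pattern.

```c
/*
 * Minimal sketch of hybrid MPI + OpenMP parallelism in the style the
 * abstract describes: one MPI rank per cluster node, OpenMP threads
 * across that node's ARM cores.  Work size and the sum computed here
 * are illustrative assumptions, not figures from the paper.
 */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each node computes a partial sum over its slice of the problem,
       spreading the loop across its local cores with OpenMP. */
    const long n = 1000000;   /* assumed per-node work size */
    double local = 0.0;

    #pragma omp parallel for reduction(+:local)
    for (long i = 0; i < n; i++) {
        local += (double)(rank * n + i);
    }

    /* Combine the per-node partial results on rank 0 over the network. */
    double total = 0.0;
    MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("nodes=%d threads/node=%d total=%f\n",
               size, omp_get_max_threads(), total);

    MPI_Finalize();
    return 0;
}
```

Under these assumptions the program would be built with an MPI compiler wrapper (for example `mpicc -fopenmp`) and launched with one rank per node via `mpirun`.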
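The abstract's use of the OpenGL ES 2.0 programming model for general purpose compute, such as image convolution, typically amounts to rendering a full-screen quad with a fragment shader that samples the input image as a texture. The snippet below is a hedged sketch of that idea, not the authors' shader: the uniform names (u_image, u_texel, u_kernel) and the 3x3 kernel size are illustrative assumptions, and the host-side EGL context, texture upload, framebuffer setup and glReadPixels readback are omitted.

```c
/* Sketch: a 3x3 image convolution expressed as an OpenGL ES 2.0
   fragment shader, embedded as a C string for the host program to
   compile with glShaderSource/glCompileShader. */
static const char *conv3x3_fs =
    "precision mediump float;                                     \n"
    "uniform sampler2D u_image;     /* input image as a texture */\n"
    "uniform vec2      u_texel;     /* 1/width, 1/height        */\n"
    "uniform float     u_kernel[9]; /* 3x3 convolution kernel   */\n"
    "varying vec2      v_uv;        /* texture coordinate       */\n"
    "void main() {                                                \n"
    "  vec4 sum = vec4(0.0);                                      \n"
    "  for (int dy = -1; dy <= 1; dy++)                           \n"
    "    for (int dx = -1; dx <= 1; dx++) {                       \n"
    "      vec2 off = vec2(float(dx), float(dy)) * u_texel;       \n"
    "      sum += texture2D(u_image, v_uv + off)                  \n"
    "             * u_kernel[(dy + 1) * 3 + (dx + 1)];            \n"
    "    }                                                        \n"
    "  gl_FragColor = sum;                                        \n"
    "}                                                            \n";
```

Rendering into an off-screen framebuffer and reading the result back gives the convolved image; the same pattern underlies the neural network inference layers mentioned in the abstract.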
Web Archive [PDF]: https://web.archive.org/web/20200311120305/https://link.springer.com/content/pdf/10.1007%2F978-3-030-18645-6_4.pdf