HERO: Heterogeneous Embedded Research Platform for Exploring RISC-V Manycore Accelerators on FPGA [article]

Andreas Kurth, Pirmin Vogel, Alessandro Capotondi, Andrea Marongiu, Luca Benini
2017 arXiv   pre-print
In this work we introduce HERO, an FPGA-based research platform that combines a PMCA composed of clusters of RISC-V cores, implemented as soft cores on an FPGA fabric, with a hard ARM Cortex-A multicore  ...  Heterogeneous embedded systems on chip (HESoCs) co-integrate a standard host processor with programmable manycore accelerators (PMCAs) to combine general-purpose computing with domain-specific, efficient  ...  GRVI Phalanx [13] is an array of clusters of RISC-V cores interconnected by a network on chip (NoC).  ... 
arXiv:1712.06497v1 fatcat:gwexa42crjb6nceyzieeflquyy

A Soft Processor Overlay with Tightly-coupled FPGA Accelerator [article]

Ho-Cheung Ng, Cheng Liu, Hayden Kwok-Hay So
2016 arXiv   pre-print
RISC-V designs.  ...  RISC-V is chosen as the instruction set for its openness and portability, and the soft processor is designed as a 4-stage pipeline to balance resource consumption and performance when implemented on FPGAs  ...  Evaluation of the Tightly-coupled Architecture 1) Experimental Setup: Four real-life applications including matrix-matrix multiplication (MM), finite impulse response (FIR) filter, K-mean clustering algorithm  ... 
arXiv:1606.06483v1 fatcat:5vb7ne7yyverjbxc5i5isybgnu

On the Performance and Isolation of Asymmetric Microkernel Design for Lightweight Manycores

Pedro Henrique Penna, Joao Vicente Souto, Davidson Francis Lima, Marcio Castro, Francois Broquedis, Henrique Freitas, Jean-Francois Mehaut
2019 2019 IX Brazilian Symposium on Computing Systems Engineering (SBESC)  
Multikernel operating systems (OSs) were introduced to match the architectural characteristics of lightweight manycores.  ...  Also, our results unveil co-design aspects between an OS kernel and the architecture of lightweight manycore, concerning the memory system and core grouping.  ...  It features an asymmetric design, which means that it exclusively runs in one core of the underlying cluster.  ... 
doi:10.1109/sbesc49506.2019.9046080 dblp:conf/sbesc/PennaSLCBFM19 fatcat:oe576cky2jfp3h4hynrszytmbq

Lightweight Software-Defined Error Correction for Memories [chapter]

Irina Alam, Lara Dolecek, Puneet Gupta
2020 Embedded Systems  
Most bit patterns decode to illegal instructions in three RISC ISAs that were characterized: 92.33% for RISC-V, 72.44% for MIPS, and 66.87% for Alpha.  ...  Note that although RISC-V is actually a little-endian architecture, for the sake of clarity big-endian is used in this example.  ... 
doi:10.1007/978-3-030-52017-5_9 fatcat:afb6femj3zdjlofkkdtwdijiwe

Multicore distributed dictionary learning: A microarray gene expression biclustering case study

Stephen Laide, John McAllister
2017 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
Specifically, for 56 test subjects each providing 12,625 gene expression levels a data matrix $X \in \mathbb{R}^{56 \times 12{,}625}$ may be decomposed according to: $X = \sum_{k=1}^{N} w_k y_k^T$ (2) where $w_k \in \mathbb{R}^{56 \times 1}$ and $y_k \in \mathbb{R}^{12{,}625 \ldots}$  ...  This paper benchmarks the performance, scalability and suitability of the Adapteva Epiphany, a combination of 16 lightweight superscalar processors, when realising a distributed dictionary learning problem  ... 
doi:10.1109/icassp.2017.7952340 dblp:conf/icassp/LaideM17 fatcat:swjcn26zpzbftkxt2ltup6nkyq

Fast, Nearly Optimal ISE Identification With I/O Serialization Through Maximal Clique Enumeration

Ajay K. Verma, Philip Brisk, Paolo Ienne
2010 IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems  
The last decade has witnessed the emergence of the application-specific instruction-set processor (ASIP) as a viable platform for embedded systems.  ...  Extensible ASIPs allow the user to augment a base processor with instruction set extensions (ISEs) that execute on dedicated hardware application-specific functional units (AFUs).  ...  We have also shown that for a RISC base processor the speedup model used to evaluate the benefit of an ISE is independent of the specific details of the microarchitecture.  ... 
doi:10.1109/tcad.2010.2041849 fatcat:axhpqdmmfnf3dptdald2vtemle

FANN-on-MCU: An Open-Source Toolkit for Energy-Efficient Neural Network Inference at the Edge of the Internet of Things [article]

Xiaying Wang, Michele Magno, Lukas Cavigelli, Luca Benini
2022 arXiv   pre-print
This paper also provides an architectural performance evaluation of neural networks on the most popular ARM Cortex-M family and the parallel RISC-V processor called Mr. Wolf.  ...  both the ARM Cortex-M series and the novel RISC-V-based Parallel Ultra-Low-Power (PULP) platform.  ...  family based on RISC-V ISA.  ... 
arXiv:1911.03314v3 fatcat:ejoluwg6lzblzgncqv5pkwls4a

Compressed Sensing Based Seizure Detection for an Ultra Low Power Multi-core Architecture

Roghayeh Aghazadeh, Fabio Montagna, Simone Benatti, Davide Rossi, Javad Frounchi
2018 2018 International Conference on High Performance Computing & Simulation (HPCS)  
The SoC, shown in Figure 2, is a multi-core programmable processor coupling an advanced MCU controlled based on a tiny (12 Kgates) RISC-V processor (zero-risky) [14] accelerated by a powerful 8-processors  ...  $\ldots (t_k - \tau))]^2 \big/ \sum_{k=1}^{N} \sin^2(2\pi f (t_k - \tau))$ (3) where $\tilde{x}$ and $\delta^2$ are respectively the mean and the variance of the data and $\tau$ is calculated from Eq  ... 
doi:10.1109/hpcs.2018.00083 dblp:conf/ieeehpcs/AghazadehMBRF18 fatcat:sv6u6jm5bbexle2cm6rpxia2gu

A 320 mW 342 GOPS Real-Time Dynamic Object Recognition Processor for HD 720p Video Streams

Jinwook Oh, Gyeonghoon Kim, Junyoung Park, Injoon Hong, Seungjin Lee, Joo-Young Kim, Jeong-Ho Woo, Hoi-Jun Yoo
2013 IEEE Journal of Solid-State Circuits  
multithreading feature extraction clusters, a cache-based feature matching processor and a machine learning engine.  ...  Index Terms: Multi-core processor, object recognition, scale invariant feature transform, heterogeneous, low power processor, dynamic resource management, dynamic voltage and frequency scaling.  ...  In these systems, a highly parallel SIMD architecture and a high bandwidth dual memory architecture are adopted to accelerate in-vehicle image recognition and K-means clustering algorithm respectively,  ... 
doi:10.1109/jssc.2012.2220651 fatcat:uh4ec3i64vdmjdev5sgck7iv2i

A 64mW DNN-based Visual Navigation Engine for Autonomous Nano-Drones

Daniele Palossi, Antonio Loquercio, Francesco Conti, Eric Flamand, Davide Scaramuzza, Luca Benini
2019 IEEE Internet of Things Journal  
ACKNOWLEDGMENTS The authors thank Hanna Müller for her contribution in designing the PULP-Shield, Noé Brun for his support in making the camera-holder, and Frank K.  ...  The processor is composed of two separate power and clock domains, the FABRIC CTRL (FC) and the CLUSTER (CL).  ...  GAP8 Architecture Our deployment target for the bulk of the DroNet computation is GAP8, a commercial embedded RISC-V multicore processor derived from the PULP open source project.  ... 
doi:10.1109/jiot.2019.2917066 fatcat:ogqpf3qzg5hc5hph6cgdccivxu

Spatz: A Compact Vector Processing Unit for High-Performance and Energy-Efficient Shared-L1 Clusters [article]

Matheus Cavalcante, Domenic Wüthrich, Matteo Perotti, Samuel Riedel, Luca Benini
2022 arXiv   pre-print
We propose Spatz, a compact, modular 32-bit vector processing unit based on the integer embedded subset of the RISC-V Vector Extension version 1.0.  ...  Those results show the viability of lean vector processors as high-performance and energy-efficient PEs for large-scale clusters with tightly-coupled L1 memory.  ...  We propose Spatz, a compact 32-bit vector machine based on the embedded subset of the RISC-V Vector Extension version 1.0 [17].  ... 
arXiv:2207.07970v1 fatcat:kjfsd2de6vaitj6ezgfnf3rvom

Improving Resilience to Timing Errors by Exposing Variability Effects to Software in Tightly-Coupled Processor Clusters

Abbas Rahimi, Daniele Cesarini, Andrea Marongiu, Rajesh K. Gupta, Luca Benini
2014 IEEE Journal on Emerging and Selected Topics in Circuits and Systems  
We propose a variability-aware OpenMP (VOMP) programming environment, suitable for tightly-coupled shared memory processor clusters, that relies upon modeling across the hardware/software interface.  ...  Manufacturing and environmental variations cause timing errors in microelectronic processors that are typically avoided by ultra-conservative multi-corner design margins or corrected by error detection  ...  Differentiating the actual work done by different processors in OpenMP is achieved by means of work-sharing constructs: , and .  ... 
doi:10.1109/jetcas.2014.2315883 fatcat:emu6fpxpxreyjihdgyxv7xhhde

RISC-V: #AlphanumericShellcoding [article]

Hadrien Barral, Rémi Géraud-Stewart, Georges-Axel Jaloyan, David Naccache
2019 arXiv   pre-print
We explain how to design RISC-V shellcodes capable of running arbitrary code, whose ASCII binary representation uses only letters a-zA-Z, digits 0-9, and any of the three characters: #, /, '.  ...  While only test boards feature RISC-V processors for now, many companies including Western Digital or Nvidia have announced the use of RISC-V chips in their future products [19].  ...  RISC-V ELF psABI specification [6] provides a register naming convention, reproduced in Table 1 Alphanumeric RISC-V The first step towards building an alphanumeric shellcode for RV64GC consists in  ... 
arXiv:1908.03819v1 fatcat:cbfl7exwdfdq7h5vz6z2jrrl5a

TCN Mapping Optimization for Ultra-Low Power Time-Series Edge Inference [article]

Alessio Burrello, Alberto Dequino, Daniele Jahier Pagliari, Francesco Conti, Marcello Zanghieri, Enrico Macii, Luca Benini, Massimo Poncino
2022 arXiv   pre-print
Temporal Convolutional Networks (TCNs) are emerging lightweight Deep Learning models for Time Series analysis.  ...  EXPERIMENTAL RESULTS AND DISCUSSION We benchmark our toolkit on GAP-8 [3], a commercial PULP SoC including a control RISC-V processor (fabric controller) and a cluster of 8 additional RISC-V cores.  ...  Similarly, GreenWaves Technologies' GAP8 SoC [3] features one I/O core and an 8-core cluster with a RISC-V Instruction Set Architecture (ISA) extension for enhanced DSP.  ... 
arXiv:2203.12925v1 fatcat:sd7styyqbnhotgvrztujslabtm

Memory-Latency-Accuracy Trade-offs for Continual Learning on a RISC-V Extreme-Edge Node [article]

Leonardo Ravaglia, Manuele Rusci, Alessandro Capotondi, Francesco Conti, Lorenzo Pellegrini, Vincenzo Lomonaco, Davide Maltoni, Luca Benini
2020 arXiv   pre-print
In this work, after quantifying memory and computational requirements of CL algorithms, we define a novel HW/SW extreme-edge platform featuring a low power RISC-V octa-core cluster tailored for on-demand  ...  Thanks to the parallelism of the low-power cluster engine, our HW/SW platform is 25x faster than a typical MCU device, on which CL is still impractical, and demonstrates an 11x gain in terms of energy  ...  The architecture consists of a cluster of 8 RISC-V tightly coupled cores featuring private FPUs and two shared L1 and L2 scratchpad memories.  ... 
arXiv:2007.13631v1 fatcat:vnqayw5mjzg25hmfnphrywsoie