τC: C with Process Network Extensions for Embedded Manycores

Thierry Goubier, Damien Couroussé, Selma Azaiez
2014 Procedia Computer Science  
In this paper, we present a process network extension to C called τC and its mapping to both a POSIX target and the P2012/STHORM platform, and show how the language offers an architecture independent  ...  Current and future embedded manycore systems bring complex and heterogeneous architectures with a large number of processing cores, making both parallel programming at this scale and understanding the  ... 
doi:10.1016/j.procs.2014.05.099 fatcat:2g6ene4r7rakhixk55klsvp26i

A Short Overview of Executing Γ Chemical Reactions over the ΣC and τC Dataflow Programming Models

Loïc Cudennec, Thierry Goubier
2015 Procedia Computer Science  
A preliminary implementation of the chemical reaction mechanisms is provided using the τC dataflow compilation toolchain, a language close to ΣC, in order to demonstrate the relevance of the proposition  ...  Today, they enter both high-performance computing systems, as well as embedded systems.  ...  The C programming model is based on networks of connected agents. An agent is an autonomous entity, with its own address space and thread of control.  ... 
doi:10.1016/j.procs.2015.05.349 fatcat:aa4tn6wmynhknc2ox2o2ixpjfu

An OpenMP backend for the ΣC streaming language

Stéphane Louise
2017 Procedia Computer Science  
The ΣC (pronounced "Sigma-C") language is a general purpose data-flow language that was initially targeted for Kalray's MPPA embedded many-core processor.  ...  It is designed as an extension of C, allowing the Cyclo-Static Data-Flow (CSDF) model of computation. Until now, it was only available for the first generation of the MPPA chip.  ...  One of the base models is Kahn Process Networks (KPN), defined by G. Kahn in 1974 [11].  ... 
doi:10.1016/j.procs.2017.05.251 fatcat:d2utmtvgjfa6xneh5eorrgmvcu

A greedy approach to tolerate defect cores for multimedia applications

Ke Yue, Soumia Ghalim, Zheng Li, Frank Lockom, Shangping Ren, Lei Zhang, Xiaowei Li
2011 2011 9th IEEE Symposium on Embedded Systems for Real-Time Multimedia  
SoC often integrates tens of cores and uses Network-on-Chip (NoC) as its communication infrastructure.  ...  To ensure high yield of manycore processors, core-level redundancy is often used as an effective approach to improve the reliability of manycore chips.  ...  First, the manycore system is a 2-D mesh network running under XY routing.  ... 
doi:10.1109/estimedia.2011.6088517 dblp:conf/estimedia/YueGLLRZL11 fatcat:zz55uetylzax7pxcygrdj4fh6y

Economic Analysis of Testing Homogeneous Manycore Chips

Lin Huang, Qiang Xu
2010 IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems  
In this paper, we develop novel analytical models to study the above tradeoff and we verify the effectiveness of the proposed test economics model for hypothetical manycore chips with various configurations  ...  To ensure the product quality of such complex integrated circuits before shipping them to final users, extensive manufacturing tests are necessary and the associated test cost can account for a large share  ...  Acknowledgment The authors would like to thank the anonymous reviewers for their constructive comments.  ... 
doi:10.1109/tcad.2010.2049052 fatcat:wezk4oskorbulo7armfi3g6tpy

Design of Many-Core Big Little μBrain for Energy-Efficient Embedded Neuromorphic Computing [article]

M. Lakshmi Varshika, Adarsha Balaji, Federico Corradi, Anup Das, Jan Stuijt, Francky Catthoor
2021 arXiv   pre-print
As spiking-based deep learning inference applications are increasing in embedded systems, these systems tend to integrate neuromorphic accelerators such as μBrain to improve energy efficiency.  ...  We evaluate the proposed big little many-core neuromorphic design and the system software framework with five commonly used SDCNN inference applications and show that the proposed solution reduces energy  ...  Extension of SentryC to Other Spiking Architectures 1) Extension to DYNAPs: DYNAPs is a crossbar-based two-layer architecture with fixed number of l0 and l1 neurons.  ... 
arXiv:2111.11838v1 fatcat:hd5hzgth5zhp7nhy5y2vcoajmu

Optimizing Coherence Traffic in Manycore Processors using Closed-Form Caching/Home Agent Mappings

Steve Kommrusch, Marcos Horro, Louis-Noel Pouchet, Gabriel Rodriguez, Juan Tourino
2021 IEEE Access  
INDEX TERMS Network-on-chip, manycores, coherence traffic, distributed directories, architectural discovery, reverse engineering.  ...  The distributed coherence subsystem must be queried for every out-of-tile access, imposing an overhead on memory latency.  ...  ACKNOWLEDGMENT The authors wish to thank John McCalpin for his invaluable insights into the KNL architecture.  ... 
doi:10.1109/access.2021.3058280 fatcat:zd26nmknuvfpbi5mtvhj6zz2ry

Dynamic Resource Allocation in Embedded, High-Performance and Cloud Computing [chapter]

Leandro Soares, Piotr Dziurzanski, Amit Kumar Singh
2016 Dynamic Resource Allocation in Embedded, High-Performance and Cloud Computing  
Based on network and communication science, we further extend the scope for 21st Century life through the knowledge in robotics, machine learning, embedded systems, cognitive science, pattern recognition  ...  The books provide professionals, researchers, educators, and advanced students in the field with an invaluable insight into the latest research and developments.  ...  The authors would like to acknowledge and thank the commission for the funding, as well as the project officers and reviewers for their suggestions and feedback on the project outcomes.  ... 
doi:10.13052/rp-9788793519077 fatcat:fw4oj6baufenlc6n2idp5rhwv4

Coherence Traffic in Manycore Processors with Opaque Distributed Directories [article]

Steve Kommrusch, Marcos Horro, Louis-Noël Pouchet, Gabriel Rodríguez, Juan Touriño
2020 arXiv   pre-print
Manycore processors feature a high number of general-purpose cores designed to work in a multithreaded fashion. Recent manycore processors are kept coherent using scalable distributed directories.  ...  The distributed coherence subsystem must be queried for every out-of-tile access, imposing an overhead on memory latency.  ...  In the proposed sub-NUMA schedule a processor located in quadrant c_1 c_0 will process only memory blocks with associated CHA in the same quadrant.  ... 
arXiv:2011.05422v1 fatcat:277m4su4bnholla6fiv4vwgx54

InterNoC: Unified Deterministic Communication For Distributed NoC-based Many-Core

Eleftherios Kyriakakis, Jens Sparso, Martin Schoeberl
2019 Zenodo  
ACKNOWLEDGEMENTS This research has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 764785, FORA-Fog Computing for  ...  Embedded systems engineers need to reason about more than just functional correctness of applications; they also need to reason about energy, time and security (ETS).  ...  CUDA CUDA is a C/C++ extension developed by NVIDIA to allow programmers to submit instructions to the GPU [8].  ... 
doi:10.5281/zenodo.5851207 fatcat:rpvcg7lr6vfvheufpbfxl2kpv4

A generalized software framework for accurate and efficient management of performance goals

Henry Hoffmann, Martina Maggio, Marco D. Santambrogio, Alberto Leva, Anant Agarwal
2013 2013 Proceedings of the International Conference on Embedded Software (EMSOFT)  
PTRADE's generality is demonstrated through the management of performance goals for a variety of benchmarks on two different Linux/x86 systems and a simulated 128-core system, each with different components  ...  PTRADE can be deployed to work on a new system with different components without redesign and reimplementation.  ...  · c_j + τ_idle · c_idle.  ... 
doi:10.1109/emsoft.2013.6658597 dblp:conf/emsoft/HoffmannMSLA13 fatcat:35xuzqopqjgltpuf4v4ydtweka

Massively parallel non-stationary EEG data processing on GPGPU platforms with Morlet continuous wavelet transform

Ze Deng, Dan Chen, Yangyang Hu, Xiaoming Wu, Weizhou Peng, Xiaoli Li
2012 Journal of Internet Services and Applications  
Nowadays, the MCWT application for processing EEG data is time-sensitive and data-intensive due to quickly increasing problem domain sizes and advancing experimental techniques.  ...  Extensive experiments have been carried out on Fermi and Kepler GPUs and a Fermi GPU cluster.  ...  Let N_c be the number of cores in one GPU card. T_GPU consists of two parts. One part is the parallel execution time using GPU for processing some data channel with the maximum time cost.  ... 
doi:10.1007/s13174-012-0071-1 fatcat:prwsxt5lwva5rapbrpvpmsweoy

Efficient homology computations on multicore and manycore systems

N. Anurag Murty, Vijay Natarajan, Sathish Vadhiyar
2013 20th Annual International Conference on High Performance Computing  
The second algorithm is based on a novel approach for homology computations on manycore/GPU architectures.  ...  This GPU algorithm is memory efficient and capable of extremely fast computation of homology for simplicial complexes with millions of simplices.  ...  For instance, a triangle with vertices A, B, and C can be represented as [A, B, C].  ... 
doi:10.1109/hipc.2013.6799139 dblp:conf/hipc/MurtyNV13 fatcat:ulbaogjwkrfvve2rtov5llhj6i

Models of Architecture for DSP Systems [chapter]

Maxime Pelcat
2018 Handbook of Signal Processing Systems  
Over the last decades, the practice of representing digital signal processing applications with formal Models of Computation (MoCs) has developed.  ...  On the architectural side of digital signal processing system development, heterogeneous systems are becoming ever more complex.  ...  Acknowledgements I am grateful to François Berry and Jocelyn Sérot for their valuable advice and support during the writing of this chapter.  ... 
doi:10.1007/978-3-319-91734-4_30 fatcat:p3a6oyvo2vbv3lx3xl5j6sum3a

Fast and energy-efficient neuromorphic deep learning with first-spike times [article]

Julian Göltz, Laura Kriener, Andreas Baumbach, Sebastian Billaudelle, Oliver Breitwieser, Benjamin Cramer, Dominik Dold, Akos Ferenc Kungl, Walter Senn, Johannes Schemmel, Karlheinz Meier, Mihai Alexandru Petrovici
2021 arXiv   pre-print
Here, we describe a rigorous derivation of a learning rule for such first-spike times in networks of leaky integrate-and-fire neurons, relying solely on input and output spike times, and show how this  ...  mechanism can implement error backpropagation in hierarchical spiking networks.  ...  Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38, 82-99 (2018). 50. Mayr, C., Hoeppner, S. & Furber, S.  ... 
arXiv:1912.11443v4 fatcat:ks4rhovwsvejvnzdb5ykc5doba