8,281 Hits in 4.8 sec

Scalable communications for a million-core neural processing architecture

Cameron Patterson, Jim Garside, Eustace Painkras, Steve Temple, Luis A. Plana, Javier Navaridas, Thomas Sharp, Steve Furber
2012 Journal of Parallel and Distributed Computing  
The design of a new high-performance computing platform to model biological neural networks requires scalable, layered communications in both hardware and software.  ...  The architecture scales from a single 18-processor chip to over 1 million processors and to simulations of billion-neuron, trillion-synapse models, with tens of trillions of neural spike-event packets  ...  The scalable SpiNNaker architecture is also garnering interest for use in applications beyond the purely spiking neural space.  ... 
doi:10.1016/j.jpdc.2012.01.016 fatcat:szz343lkvvb4rcpvfek4oof2fa
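
The snippet above centers on multicast spike-event packets routed across the machine. A minimal sketch of the general idea behind table-driven multicast routing, assuming a hypothetical (key, mask, links) entry format; SpiNNaker's actual router and packet layout are more involved:

```python
# Hypothetical simplification of table-driven multicast routing: a spike
# packet carries only the source neuron's routing key; each router forwards
# it along every link whose entry matches (key & mask) == entry_key.
def route(key, table):
    """Return the set of output links for a spike packet with this key."""
    links_out = set()
    for entry_key, mask, links in table:
        if key & mask == entry_key:
            links_out |= links
    return links_out

# Example: all neurons with keys 0x1200-0x12FF fan out to links 1 and 3.
table = [(0x1200, 0xFF00, {1, 3})]
assert route(0x1234, table) == {1, 3}
```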

Real-Time Scalable Cortical Computing at 46 Giga-Synaptic OPS/Watt with ~100× Speedup in Time-to-Solution and ~100,000× Reduction in Energy-to-Solution

Andrew S. Cassidy, Rodrigo Alvarez-Icaza, Filipp Akopyan, Jun Sawada, John V. Arthur, Paul A. Merolla, Pallab Datta, Marc Gonzalez Tallada, Brian Taba, Alexander Andreopoulos, Arnon Amir, Steven K. Esser (+26 others)
2014 SC14: International Conference for High Performance Computing, Networking, Storage and Analysis  
Breaking with the von Neumann architecture, TrueNorth is a 4,096-core, 1-million-neuron, 256-million-synapse brain-inspired neurosynaptic processor that consumes 65 mW of power running in real time  ...  We demonstrate seamless tiling of TrueNorth chips into arrays, forming a foundation for cortex-like scalability.  ...  The primary advantage of using a core is that it overcomes a key communication bottleneck that limits scalability for large-scale network simulations.  ... 
doi:10.1109/sc.2014.8 dblp:conf/sc/CassidyAASAMDTTAAEKAHBMBBMSCIMIMNVGNLAFJFRMM14 fatcat:6ujlnfomhzd3bcyfxyjj25pwgm
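
The entry above describes event-driven, core-local neurosynaptic computation. A toy Python sketch of that style, assuming binary input spikes, an axon-by-neuron weight crossbar, and a simple threshold-and-reset neuron; TrueNorth's actual neuron model and core parameters differ:

```python
import numpy as np

def core_step(spikes_in, weights, potential, threshold=1.0, leak=0.01):
    """One tick of a toy event-driven neurosynaptic core.

    spikes_in: binary vector over the core's input axons.
    weights:   (axons x neurons) crossbar matrix.
    """
    active = np.flatnonzero(spikes_in)     # event-driven: only firing axons do work
    potential = potential + weights[active].sum(axis=0) - leak
    fired = potential >= threshold
    potential[fired] = 0.0                 # reset neurons that spiked
    return fired, potential
```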

POETS: A Parallel Cluster Architecture for Spiking Neural Network

Mahyar Shahsavari, Jonathan Beaumont, David Thomas, Andrew D. Brown
2021 International Journal of Machine Learning and Computing  
The current system consists of 48 FPGAs, providing 3072 processing cores and 49152 threads. We use this hardware to implement up to four million neurons with one thousand synapses each.  ...  This work presents a highly scalable hardware platform called POETS, and uses it to implement SNNs on a very large number of parallel and reconfigurable FPGA-based processors.  ...  In this work, we focused on the scalability and architecture of the system; in future work we will investigate the learning accuracy and neural network capabilities of POETS.  ... 
doi:10.18178/ijmlc.2021.11.4.1048 fatcat:znyzi4g735dudj4o2lxibcwppe
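
For a sense of scale, straightforward arithmetic on the figures quoted above (derived from the abstract's numbers, not from the paper's own mapping) gives the per-device and per-thread load:

```python
fpgas, cores, threads = 48, 3072, 49152
neurons, syn_per_neuron = 4_000_000, 1_000

print(cores // fpgas)               # 64 cores per FPGA
print(threads // cores)             # 16 hardware threads per core
print(round(neurons / threads))     # ~81 neurons mapped per thread
print(neurons * syn_per_neuron)     # 4,000,000,000 synapses in total
```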

Benchmarking a Many-Core Neuromorphic Platform With an MPI-Based DNA Sequence Matching Algorithm

Gianvito Urgese, Francesco Barchi, Emanuele Parisi, Evelina Forno, Andrea Acquaviva, Enrico Macii
2019 Electronics  
SpiNNaker is a neuromorphic globally asynchronous locally synchronous (GALS) multi-core architecture designed for simulating a spiking neural network (SNN) in real time.  ...  Experimental results indicate that the SpiNNaker parallel architecture allows a linear performance increase with the number of cores used and shows better scalability compared to a general-purpose multi-core  ...  Abbreviations used in this manuscript: SpiNNaker (Spiking Neural Network Architecture), Dynap-SEL (Dynamic Asynchronous Processor Scalable and Learning), BrainScaleS (Brain-inspired  ... 
doi:10.3390/electronics8111342 fatcat:yjsmlxwqtrh2pht53mcz3wux2e
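
The benchmark above ports an MPI-based DNA sequence matcher to a many-core machine. A minimal mpi4py sketch of the obvious data decomposition, assuming exact substring matching and a chunk overlap of pattern length minus one; the paper's actual kernel on SpiNNaker may differ:

```python
from mpi4py import MPI

def count_matches(text, pattern):
    m = len(pattern)
    return sum(text[i:i + m] == pattern for i in range(len(text) - m + 1))

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

reference = "ACGT" * 250_000        # stand-in reference sequence
pattern = "GTAC"

# Even split with an overlap of len(pattern)-1, so matches that straddle
# a chunk boundary are found exactly once.
chunk = len(reference) // size
start = rank * chunk
stop = len(reference) if rank == size - 1 else start + chunk + len(pattern) - 1

local = count_matches(reference[start:stop], pattern)
total = comm.reduce(local, op=MPI.SUM, root=0)
if rank == 0:
    print("matches:", total)
```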

A comparative study of GPU programming models and architectures using neural networks

Vivek K. Pallipuram, Mohammad Bhuiyan, Melissa C. Smith
2011 Journal of Supercomputing  
There has been strong interest in the neuroscience community in modelling the mammalian brain in order to study its architecture and functional principles.  ...  Spiking Neural Network (SNN) models have been widely employed to simulate the mammalian brain, capturing its functionality and inference capabilities.  ...  used a 2.66 GHz Intel Core 2 Quad host processor coupled with Nvidia's state-of-the-art Fermi architecture and Compute Unified Device Architecture (CUDA) as the programming model [5].  ... 
doi:10.1007/s11227-011-0631-3 fatcat:wq6xwp5panbnzenl7z2cib2gmi
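
SNN simulation suits GPUs because every neuron runs the same state update over different data. A vectorized NumPy sketch of one leaky integrate-and-fire step with arbitrary constants, illustrating the data parallelism the surveyed CUDA implementations exploit rather than any specific kernel from the paper:

```python
import numpy as np

n = 10_000                                   # neurons updated in lock-step
v = np.zeros(n)                              # membrane potentials
tau, v_thresh, v_reset, dt = 20.0, 1.0, 0.0, 1.0

def lif_step(v, i_syn):
    """One Euler step of leaky integrate-and-fire for all neurons at once."""
    v = v + dt * (-v / tau + i_syn)          # identical update per neuron: SIMT-friendly
    fired = v >= v_thresh
    return np.where(fired, v_reset, v), fired

v, fired = lif_step(v, i_syn=0.1 * np.random.rand(n))
```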

SpiNNaker: A multi-core System-on-Chip for massively-parallel neural net simulation

Eustace Painkras, Luis A. Plana, Jim Garside, Steve Temple, Simon Davidson, Jeffrey Pepper, David Clark, Cameron Patterson, Steve Furber
2012 Proceedings of the IEEE 2012 Custom Integrated Circuits Conference  
The MPSoC contains 100 million transistors in a 102 mm² die, provides a peak performance of 3.96 GIPS and has a power consumption of 1 W at 1.2 V when all processor cores operate at nominal frequency.  ...  The modelling of large systems of spiking neurons is computationally very demanding in terms of processing power and communication.  ...  Experimental results show that, for massively-parallel neural net simulations, a customized multi-core architecture can be  ...  Though SpiNNaker is an application-specific architecture, it can be used as a  ... 
doi:10.1109/cicc.2012.6330636 dblp:conf/cicc/PainkrasPGTDPCPF12 fatcat:cm5i4u3wa5ghffa52nxeqrynwa

A parallel computing platform for training large scale neural networks

Rong Gu, Furao Shen, Yihua Huang
2013 2013 IEEE International Conference on Big Data  
Third, we choose a compact, event-driven messaging communication model instead of the heartbeat polling model for instant message delivery.  ...  Unlike many existing parallel neural network training systems working on thousands of training samples, cNeural is designed for fast training on large-scale datasets with millions of training samples.  ...  This kind of approach is suitable for implementation on multi-core or many-core architectures, which have low communication cost.  ... 
doi:10.1109/bigdata.2013.6691598 dblp:conf/bigdataconf/GuSH13 fatcat:57qxdszarrasbinrgbsofmilay
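
The contrast drawn above between event-driven messaging and heartbeat polling can be sketched in a few lines of Python; the queue-based delivery below is a generic illustration, not cNeural's actual protocol:

```python
import queue, time

inbox = queue.Queue()

def handle(msg):
    print("got", msg)

# Heartbeat polling: wake on a fixed period and check for work, paying up
# to one period of latency plus wasted wakeups when the inbox is empty.
def poll_loop(period_s=0.1):
    while True:
        try:
            handle(inbox.get_nowait())
        except queue.Empty:
            pass
        time.sleep(period_s)

# Event-driven: block until a message actually arrives, so delivery is
# immediate and idle workers burn no polling cycles.
def event_loop():
    while True:
        handle(inbox.get())
```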

SpiNNaker: Mapping neural networks onto a massively-parallel chip multiprocessor

M.M. Khan, D.R. Lester, L.A. Plana, A. Rast, X. Jin, E. Painkras, S.B. Furber
2008 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence)  
SpiNNaker is a novel chip, based on the ARM processor, designed to support large-scale spiking neural network simulations.  ...  Our eventual goal is to be able to simulate neural networks consisting of 10⁹ neurons running in 'real time', by which we mean that a similarly sized collection of biological neurons would run at the  ...  Steve Furber holds a Royal Society-Wolfson Research Merit Award. We appreciate the support of these sponsors and industrial partners.  ... 
doi:10.1109/ijcnn.2008.4634199 dblp:conf/ijcnn/KhanLPRJPF08 fatcat:w6jacfvycrajxead7t5vedmjue

Podracer architectures for scalable Reinforcement Learning [article]

Matteo Hessel, Manuel Kroiss, Aidan Clark, Iurii Kemaev, John Quan, Thomas Keck, Fabio Viola, Hado van Hasselt
2021 arXiv   pre-print
In this report we argue that TPUs are particularly well suited for training RL agents in a scalable, efficient and reproducible way.  ...  Specifically we describe two architectures designed to make the best use of the resources available on a TPU Pod (a special configuration in a Google data center that features multiple TPU devices connected  ...  Also, thanks to the JAX team for the amazing JAX library that made implementing all these architectures easy and fun!  ... 
arXiv:2104.06272v1 fatcat:b2r6vt6w6rdc3j43ldxqipkzgi
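
A minimal sketch of the building block these TPU architectures rely on: replicating an update step across accelerator devices with jax.pmap and averaging gradients with an all-reduce. The quadratic loss and parameter shapes are placeholders, not the report's agents:

```python
import jax
import jax.numpy as jnp

def loss_fn(params, batch):
    return jnp.mean((batch @ params) ** 2)        # placeholder objective

def update(params, batch):
    grads = jax.grad(loss_fn)(params, batch)
    grads = jax.lax.pmean(grads, axis_name="d")   # all-reduce across devices
    return params - 0.01 * grads

p_update = jax.pmap(update, axis_name="d")        # one replica per device

n_dev = jax.local_device_count()
params = jnp.zeros((n_dev, 8))                    # replicated parameters
batch = jnp.ones((n_dev, 32, 8))                  # per-device data shards
params = p_update(params, batch)
```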

Real-Time Simulation of Passage-of-Time Encoding in Cerebellum Using a Scalable FPGA-Based System

Junwen Luo, Graeme Coapes, Terrence Mak, Tadashi Yamazaki, Chung Tin, Patrick Degenaar
2016 IEEE Transactions on Biomedical Circuits and Systems  
In this paper, we present a frame-based network-on-chip (NoC) hardware architecture for implementing a bio-realistic cerebellum model, which has been used for studying timing control or passage-of-time  ...  The cerebellum plays a critical role in sensorimotor control and learning.  ...  Finally, a frame master is implemented to coordinate neural and communication processing periods. A. Neural Computing: The neural processor data path is shown in Fig. 4.  ... 
doi:10.1109/tbcas.2015.2460232 pmid:26452290 fatcat:qh24oq67d5dzfocolwcwmarvi4

Real-Time Cortical Simulations: Energy and Interconnect Scaling on Distributed Systems

Francesco Simula, Elena Pastorelli, Pier Stanislao Paolucci, Michele Martinelli, Alessandro Lonardo, Andrea Biagioni, Cristiano Capone, Fabrizio Capuani, Paolo Cretaro, Giulia De Bonis, Francesca Lo Cicero, Luca Pontisso (+2 others)
2019 2019 27th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)  
with a dedicated interconnect scalable to millions of cores; simulation of deep-sleep Slow Wave Activity (SWA) and Asynchronous aWake (AW) regimes expressed by thalamo-cortical models are among their benchmarks  ...  Reaching efficient real-time performance on large-scale cortical simulations is of increasing relevance both for future bio-inspired artificial intelligence applications and for understanding the cognitive functions  ...  uses distinct executables for different architectures; in this way, the simulation of the neural network is split between partitions of processes executing on ARM and Intel cores.  ... 
doi:10.1109/empdp.2019.8671627 dblp:conf/pdp/SimulaPPMLBCCCB19 fatcat:i3xrixafdne6leffjvaery4ydu

Real-time cortical simulations: energy and interconnect scaling on distributed systems [article]

Francesco Simula, Elena Pastorelli, Pier Stanislao Paolucci, Michele Martinelli, Alessandro Lonardo, Andrea Biagioni, Cristiano Capone, Fabrizio Capuani, Paolo Cretaro, Giulia De Bonis, Francesca Lo Cicero, Luca Pontisso, Piero Vicini (+1 others)
2019 arXiv   pre-print
with a dedicated interconnect scalable to millions of cores; simulation of deep-sleep Slow Wave Activity (SWA) and Asynchronous aWake (AW) regimes expressed by thalamo-cortical models are among their benchmarks  ...  Reaching efficient real-time performance on large-scale cortical simulations is of increasing relevance both for future bio-inspired artificial intelligence applications and for understanding the cognitive functions  ...  uses distinct executables for different architectures; in this way, the simulation of the neural network is split between partitions of processes executing on ARM and Intel cores.  ... 
arXiv:1812.04974v3 fatcat:bsb2o6jrvzgb7d4xbqa3xbqiu4

Enabling Large-Scale Simulations With the GENESIS Neuronal Simulator

Joshua C. Crone, Manuel M. Vindiola, Alfred B. Yu, David L. Boothe, David Beeman, Kelvin S. Oie, Piotr J. Franaszczuk
2019 Frontiers in Neuroinformatics  
In this paper, we evaluate the computational performance of the GEneral NEural SImulation System (GENESIS) for large-scale simulations of neural networks.  ...  While many benchmark studies have been performed for large-scale simulations with leaky integrate-and-fire neurons or neuronal models with only a few compartments, this work focuses on higher fidelity  ...  They optimized the compute engine of NEURON for modern multi-core computing architectures and examined the parallel scalability for models of varying complexity.  ... 
doi:10.3389/fninf.2019.00069 pmid:31803040 pmcid:PMC6873326 fatcat:jiy4gfscpreu7dnnwmugx3t5ai

A Survey on Parallelization of Neural Network using MPI and Open MP

P. Chanthini, K. Shyamala
2016 Indian Journal of Science and Technology  
In the human brain, millions of neurons form a massively parallel information system. Method: A neural network is a parallel and distributed process.  ...  In modern microprocessors the number of cores is rapidly increasing, so high-performance computing is a great challenge for application developers.  ...  The authors focused on HPC [24]: shared-memory nodes with several multi-core CPUs communicate via a network infrastructure.  ... 
doi:10.17485/ijst/2016/v9i19/93835 fatcat:mtry5jqn75hl5oe6zan7juz3nu
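
The hybrid model the survey describes (message passing between shared-memory nodes, threads within each node) can be sketched with mpi4py plus a thread pool; OpenMP itself is a C/C++/Fortran API, so Python threads stand in for it here purely for illustration:

```python
from concurrent.futures import ThreadPoolExecutor
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

def update(n):                        # placeholder per-neuron work
    return n * n % 97

local = range(rank, 1_000_000, size)  # MPI: partition neurons across processes

# Threads: fan the process-local partition out across cores, as OpenMP would.
with ThreadPoolExecutor() as pool:
    local_sum = sum(pool.map(update, local))

total = comm.reduce(local_sum, op=MPI.SUM, root=0)
if rank == 0:
    print("checksum:", total)
```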

The impact of on-chip communication on memory technologies for neuromorphic systems

Saber Moradi, Rajit Manohar
2018 Journal of Physics D: Applied Physics  
Emergent nanoscale non-volatile memory technologies with high integration density offer a promising solution to overcome the scalability limitations of CMOS-based neural network architectures, by efficiently  ...  We present existing approaches for on-chip neuromorphic routing networks, and discuss how new memory and integration technologies may help to alleviate the communication issues in constructing next-generation  ...  The communication architecture is also mesh based. Loihi is a fully digital architecture implemented in a 14 nm CMOS process, and its routing network is designed using asynchronous circuits.  ... 
doi:10.1088/1361-6463/aae641 fatcat:kw6alqj6grdlpddvrtocenxwfm
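
Both routers mentioned in the snippet are mesh based. A minimal sketch of dimension-ordered (XY) routing, a common baseline scheme for 2D-mesh NoCs; it is given as a generic illustration, not as Loihi's or any specific chip's router:

```python
def xy_route(src, dst):
    """Dimension-ordered (XY) routing on a 2D mesh: move along X first,
    then along Y. Deterministic and deadlock-free; returns the hop list."""
    (x, y), (dx, dy) = src, dst
    path = []
    while x != dx:
        x += 1 if dx > x else -1
        path.append((x, y))
    while y != dy:
        y += 1 if dy > y else -1
        path.append((x, y))
    return path

assert xy_route((0, 0), (2, 1)) == [(1, 0), (2, 0), (2, 1)]
```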
Showing results 1–15 of 8,281