Scalable communications for a million-core neural processing architecture
2012
Journal of Parallel and Distributed Computing
The design of a new high-performance computing platform to model biological neural networks requires scalable, layered communications in both hardware and software. ...
The architecture scales from a single 18-processor chip to over 1 million processors and to simulations of billion-neuron, trillion-synapse models, with tens of trillions of neural spike-event packets ...
The scalable SpiNNaker architecture is also garnering interest for use in applications beyond the purely spiking neural space. ...
doi:10.1016/j.jpdc.2012.01.016
fatcat:szz343lkvvb4rcpvfek4oof2fa
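SpiNNaker's layered communications carry spikes as small routed packets. A minimal sketch of the address-event idea, where a spike is just the identifier of the neuron that fired; the field widths below are illustrative assumptions, not SpiNNaker's actual packet layout:

```python
# Minimal address-event representation (AER) sketch: a spike is encoded as
# the identifier of the neuron that fired and routed by key, not by payload.
# Field widths are illustrative assumptions, not SpiNNaker's real format.

def encode_spike(chip_id: int, core_id: int, neuron_id: int) -> int:
    """Pack a spike event into a single 32-bit routing key."""
    assert chip_id < (1 << 16) and core_id < (1 << 5) and neuron_id < (1 << 11)
    return (chip_id << 16) | (core_id << 11) | neuron_id

def decode_spike(key: int) -> tuple[int, int, int]:
    """Unpack a 32-bit routing key back into (chip, core, neuron)."""
    return key >> 16, (key >> 11) & 0x1F, key & 0x7FF

key = encode_spike(chip_id=42, core_id=3, neuron_id=700)
assert decode_spike(key) == (42, 3, 700)
```

Because the packet is only a key, routers can forward trillions of such events by table lookup without inspecting any payload, which is what makes the architecture's spike traffic scale.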
Real-Time Scalable Cortical Computing at 46 Giga-Synaptic OPS/Watt with ~100× Speedup in Time-to-Solution and ~100,000× Reduction in Energy-to-Solution
2014
SC14: International Conference for High Performance Computing, Networking, Storage and Analysis
Breaking with the von Neumann architecture, TrueNorth is a 4,096-core, 1-million-neuron, 256-million-synapse brain-inspired neurosynaptic processor that consumes 65 mW of power running in real time ...
We demonstrate seamless tiling of TrueNorth chips into arrays, forming a foundation for cortex-like scalability. ...
The primary advantage of using a core is that it overcomes a key communication bottleneck that limits scalability for large scale network simulations. ...
doi:10.1109/sc.2014.8
dblp:conf/sc/CassidyAASAMDTTAAEKAHBMBBMSCIMIMNVGNLAFJFRMM14
fatcat:6ujlnfomhzd3bcyfxyjj25pwgm
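Each TrueNorth core pairs input axons with neurons through a binary synaptic crossbar (4,096 cores × 256 neurons ≈ 1 million neurons; 4,096 × 256 × 256 ≈ 256 million synapses). A hedged NumPy sketch of one time step on such a crossbar, using simplified integrate-and-fire dynamics rather than IBM's actual neuron model:

```python
import numpy as np

# One neurosynaptic core modeled as a binary 256x256 axon-to-neuron crossbar.
# The simple integrate-and-fire dynamics below are an illustrative
# simplification, not TrueNorth's actual neuron model.
AXONS, NEURONS, THRESHOLD = 256, 256, 3.0
rng = np.random.default_rng(0)
crossbar = (rng.random((AXONS, NEURONS)) < 0.05).astype(np.float32)
potential = np.zeros(NEURONS, dtype=np.float32)

def tick(axon_spikes: np.ndarray) -> np.ndarray:
    """Advance one time step: integrate incoming spikes, fire, reset."""
    global potential
    potential += axon_spikes.astype(np.float32) @ crossbar  # accumulate input
    fired = potential >= THRESHOLD
    potential[fired] = 0.0                                  # reset on fire
    return fired

fired = tick(rng.random(AXONS) < 0.1)  # boolean vector: which neurons spiked
```

The seamless tiling the abstract mentions follows from this structure: a fired neuron's output simply becomes an axon event on some core, on this chip or a neighboring one.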
POETS: A Parallel Cluster Architecture for Spiking Neural Network
2021
International Journal of Machine Learning and Computing
The current system consists of 48 FPGAs, providing 3072 processing cores and 49152 threads. We use this hardware to implement up to four million neurons with one thousand synapses each. ...
This work presents a highly-scalable hardware platform called POETS, and uses it to implement SNN on a very large number of parallel and reconfigurable FPGA-based processors. ...
In this work, we focused on the scalability and architecture of the system, while in future work we will investigate the accuracy of learning and the neural network capabilities of POETS. ...
doi:10.18178/ijmlc.2021.11.4.1048
fatcat:znyzi4g735dudj4o2lxibcwppe
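The quoted capacity figures are internally consistent and imply roughly 81 neurons per hardware thread; a quick back-of-envelope check:

```python
# Back-of-envelope check of the POETS capacity figures quoted above.
fpgas, cores, threads = 48, 3072, 49152
assert cores // fpgas == 64           # 64 cores per FPGA
assert threads // cores == 16         # 16 hardware threads per core
neurons, synapses_per_neuron = 4_000_000, 1_000
print(neurons / threads)              # ~81 neurons simulated per thread
print(neurons * synapses_per_neuron)  # ~4e9 synapses in the largest model
```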
Benchmarking a Many-Core Neuromorphic Platform With an MPI-Based DNA Sequence Matching Algorithm
2019
Electronics
SpiNNaker is a neuromorphic globally asynchronous locally synchronous (GALS) multi-core architecture designed for simulating a spiking neural network (SNN) in real-time. ...
Experimental results indicate that the SpiNNaker parallel architecture allows a linear performance increase with the number of used cores and shows better scalability compared to a general-purpose multi-core ...
Abbreviations: The following abbreviations are used in this manuscript: SpiNNaker, Spiking Neural Network Architecture; Dynap-SEL, Dynamic Asynchronous Processor Scalable and Learning; BrainScaleS, Brain-inspired ...
doi:10.3390/electronics8111342
fatcat:yjsmlxwqtrh2pht53mcz3wux2e
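The benchmark runs an MPI-style programming model on SpiNNaker's cores. A minimal mpi4py sketch of the scatter/match/reduce pattern for sequence matching; the paper's actual on-chip MPI runtime is custom, so treat this as an illustration of the programming model, not the paper's code:

```python
# Minimal distributed sequence-matching sketch using mpi4py.
# Run with: mpiexec -n 4 python match.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

PATTERN = "GATTACA"

if rank == 0:
    genome = "TTGATTACAGG" * 1000                  # toy reference sequence
    chunk = len(genome) // size
    # Overlap chunks by len(PATTERN)-1 so boundary matches are not lost.
    chunks = [genome[i * chunk: (i + 1) * chunk + len(PATTERN) - 1]
              for i in range(size)]
else:
    chunks = None

local = comm.scatter(chunks, root=0)               # distribute chunks
count = sum(1 for i in range(len(local) - len(PATTERN) + 1)
            if local[i:i + len(PATTERN)] == PATTERN)
total = comm.reduce(count, op=MPI.SUM, root=0)     # combine partial counts

if rank == 0:
    print(f"matches: {total}")
```

The linear scaling reported in the abstract is what this pattern predicts: each core scans an independent chunk, and the only communication is the final reduction.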
A comparative study of GPU programming models and architectures using neural networks
2011
Journal of Supercomputing
There has been a strong interest in the neuroscience community to model a mammalian brain in order to study its architecture and functional principles. ...
Spiking Neural Network (SNN) models have been widely employed to simulate the mammalian brain, capturing its functionality and inference capabilities. ...
used a 2.66 GHz Intel Core 2 Quad host processor coupled with Nvidia's state-of-the-art Fermi architecture and Compute Unified Device Architecture (CUDA) as the programming model [5]. ...
doi:10.1007/s11227-011-0631-3
fatcat:wq6xwp5panbnzenl7z2cib2gmi
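The SNN models benchmarked in such GPU studies typically reduce to a vectorized per-time-step state update, which is what maps naturally onto CUDA. A sketch of a leaky integrate-and-fire (LIF) update in NumPy, with illustrative constants; swapping NumPy for CuPy would run the same code on a GPU:

```python
import numpy as np   # replacing numpy with cupy runs the same code on a GPU

N = 100_000                                       # number of neurons
DT, TAU, V_TH, V_RESET = 1e-3, 20e-3, 1.0, 0.0    # illustrative constants

def lif_step(v, input_current):
    """One forward-Euler step of leaky integrate-and-fire dynamics."""
    v = v + (DT / TAU) * (-v + input_current)     # leak toward the input
    spikes = v >= V_TH                            # threshold crossing
    v = np.where(spikes, V_RESET, v)              # reset neurons that fired
    return v, spikes

rng = np.random.default_rng(1)
v = np.zeros(N)
for _ in range(100):                              # 100 ms of simulated time
    v, spikes = lif_step(v, 2.0 * rng.random(N))
```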
SpiNNaker: A multi-core System-on-Chip for massively-parallel neural net simulation
2012
Proceedings of the IEEE 2012 Custom Integrated Circuits Conference
The MPSoC contains 100 million transistors in a 102 mm² die, provides a peak performance of 3.96 GIPS, and has a power consumption of 1 W at 1.2 V when all processor cores operate at nominal frequency. ...
The modelling of large systems of spiking neurons is computationally very demanding in terms of processing power and communication. ...
Experimental results show that, for massively-parallel neural net simulations, a customized multi-core architecture can be ... Though SpiNNaker is an application-specific architecture, it can be used as a ...
doi:10.1109/cicc.2012.6330636
dblp:conf/cicc/PainkrasPGTDPCPF12
fatcat:cm5i4u3wa5ghffa52nxeqrynwa
A parallel computing platform for training large scale neural networks
2013
2013 IEEE International Conference on Big Data
Third, we choose a compact, event-driven messaging communication model instead of the heartbeat polling model for instant messaging delivery. ...
Unlike many existing parallel neural network training systems working on thousands of training samples, cNeural is designed for fast training large scale datasets with millions of training samples. ...
This kind of approach is suitable for implementation on multi-core or many-core architectures, which have low communication cost. ...
doi:10.1109/bigdata.2013.6691598
dblp:conf/bigdataconf/GuSH13
fatcat:57qxdszarrasbinrgbsofmilay
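The event-driven delivery model the cNeural authors choose blocks until a message arrives, instead of waking on a timer to poll. A minimal single-machine sketch of that contrast using Python's standard library; cNeural itself is a distributed system, so this shows only the delivery semantics:

```python
import queue, threading, time

inbox: "queue.Queue[str]" = queue.Queue()

def event_driven_consumer() -> None:
    # Blocks until a message exists: no added latency, no idle wakeups.
    msg = inbox.get()
    print("event-driven got:", msg)

def heartbeat_consumer(poll_interval: float = 0.5) -> None:
    # Wakes every poll_interval to check: adds up to poll_interval of
    # latency per message and burns cycles while the queue is empty.
    while True:
        try:
            print("polling got:", inbox.get_nowait())
            return
        except queue.Empty:
            time.sleep(poll_interval)

threading.Thread(target=event_driven_consumer).start()
inbox.put("weights-updated")   # delivered to the blocked consumer instantly
```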
SpiNNaker: Mapping neural networks onto a massively-parallel chip multiprocessor
2008
2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence)
SpiNNaker is a novel chip, based on the ARM processor, designed to support large-scale spiking neural network simulations. ...
Our eventual goal is to be able to simulate neural networks consisting of 10⁹ neurons running in 'real time', by which we mean that a similarly sized collection of biological neurons would run at the ...
Steve Furber holds a Royal Society-Wolfson Research Merit Award. We appreciate the support of these sponsors and industrial partners. ...
doi:10.1109/ijcnn.2008.4634199
dblp:conf/ijcnn/KhanLPRJPF08
fatcat:w6jacfvycrajxead7t5vedmjue
Podracer architectures for scalable Reinforcement Learning
[article]
2021
arXiv
pre-print
In this report we argue that TPUs are particularly well suited for training RL agents in a scalable, efficient and reproducible way. ...
Specifically we describe two architectures designed to make the best use of the resources available on a TPU Pod (a special configuration in a Google data center that features multiple TPU devices connected ...
Also, thanks to the JAX team for the amazing JAX library that made implementing all these architectures easy and fun! ...
arXiv:2104.06272v1
fatcat:b2r6vt6w6rdc3j43ldxqipkzgi
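The Podracer architectures lean on JAX's SPMD primitives to replicate a learner across the devices of a TPU Pod. A hedged sketch of the core pattern, a pmapped update with gradients averaged across devices via pmean; this is not the actual Anakin or Sebulba implementation, and the loss is a toy stand-in:

```python
import functools
import jax
import jax.numpy as jnp

def loss(params, batch):
    """Toy quadratic loss standing in for an RL learner's objective."""
    return jnp.mean((params * batch - 1.0) ** 2)

@functools.partial(jax.pmap, axis_name="i")      # one replica per device
def update(params, batch):
    grads = jax.grad(loss)(params, batch)
    grads = jax.lax.pmean(grads, axis_name="i")  # all-reduce across devices
    return params - 0.1 * grads

n = jax.local_device_count()
params = jnp.zeros((n, 4))       # parameters replicated across devices
batch = jnp.ones((n, 4))         # per-device data shards
params = update(params, batch)   # one synchronized step on all devices
```

Because the all-reduce happens inside the compiled, pmapped function, the same script scales from one host's devices to a full pod slice without code changes, which is the reproducibility argument the report makes.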
Real-Time Simulation of Passage-of-Time Encoding in Cerebellum Using a Scalable FPGA-Based System
2016
IEEE Transactions on Biomedical Circuits and Systems
In this paper, we present a frame-based network-on-chip (NoC) hardware architecture for implementing a bio-realistic cerebellum model with neurons, which has been used for studying timing control or passage-of-time ...
The cerebellum plays a critical role for sensorimotor control and learning. ...
Finally, a frame master is implemented to coordinate neural and communication processing periods.
A. Neural Computing: The neural processor data path is shown in Fig. 4. ...
doi:10.1109/tbcas.2015.2460232
pmid:26452290
fatcat:qh24oq67d5dzfocolwcwmarvi4
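The frame-based organization alternates a neural-computation period with a communication period under the frame master's control. A sketch of that scheduling structure; the Core class is a hypothetical stand-in, since the real design is FPGA hardware:

```python
# Frame-master scheduling sketch: each frame alternates a neural-computation
# phase with a spike-exchange phase, mirroring the frame-based NoC described
# above. The Core class is a software stand-in for an FPGA processing node.

class Core:
    def __init__(self, name: str):
        self.name, self.inbox = name, []

    def compute_neurons(self) -> list[str]:
        """Neural phase: consume delivered spikes, produce new ones."""
        produced = [f"spike-from-{self.name}"]
        self.inbox.clear()
        return produced

def run_frames(cores: list[Core], n_frames: int) -> None:
    for _ in range(n_frames):
        outputs = [c.compute_neurons() for c in cores]   # phase 1: compute
        for src, spikes in zip(cores, outputs):          # phase 2: exchange
            for dst in cores:
                if dst is not src:
                    dst.inbox.extend(spikes)
        # Frame boundary: all spikes are delivered before the next frame,
        # which is what keeps the timing deterministic and real-time.

run_frames([Core("A"), Core("B")], n_frames=3)
```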
Real-Time Cortical Simulations: Energy and Interconnect Scaling on Distributed Systems
2019
2019 27th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)
with a dedicated interconnect scalable to millions of cores; simulation of deep sleep Slow Wave Activity (SWA) and Asynchronous aWake (AW) regimes expressed by thalamo-cortical models are among their benchmarks ...
Reaching efficient real-time performance on large-scale cortical simulations is of increasing relevance both for future bio-inspired artificial intelligence applications and for understanding the cognitive functions ...
uses distinct executables for different architectures; in this way, the simulation of the neural network is split between partitions of processes executing on ARM and Intel cores. ...
doi:10.1109/empdp.2019.8671627
dblp:conf/pdp/SimulaPPMLBCCCB19
fatcat:i3xrixafdne6leffjvaery4ydu
Real-time cortical simulations: energy and interconnect scaling on distributed systems
[article]
2019
arXiv
pre-print
with a dedicated interconnect scalable to millions of cores; simulation of deep sleep Slow Wave Activity (SWA) and Asynchronous aWake (AW) regimes expressed by thalamo-cortical models are among their benchmarks ...
Reaching efficient real-time performance on large-scale cortical simulations is of increasing relevance both for future bio-inspired artificial intelligence applications and for understanding the cognitive functions ...
uses distinct executables for different architectures; in this way, the simulation of the neural network is split between partitions of processes executing on ARM and Intel cores. ...
arXiv:1812.04974v3
fatcat:bsb2o6jrvzgb7d4xbqa3xbqiu4
Enabling Large-Scale Simulations With the GENESIS Neuronal Simulator
2019
Frontiers in Neuroinformatics
In this paper, we evaluate the computational performance of the GEneral NEural SImulation System (GENESIS) for large scale simulations of neural networks. ...
While many benchmark studies have been performed for large scale simulations with leaky integrate-and-fire neurons or neuronal models with only a few compartments, this work focuses on higher fidelity ...
They optimized the compute engine of NEURON for modern multi-core computing architectures and examined the parallel scalability for models of varying complexity. ...
doi:10.3389/fninf.2019.00069
pmid:31803040
pmcid:PMC6873326
fatcat:jiy4gfscpreu7dnnwmugx3t5ai
A Survey on Parallelization of Neural Network using MPI and Open MP
2016
Indian Journal of Science and Technology
In the human brain, millions of neurons form a massively parallel information system. Method: A neural network is a parallel and distributed process. ...
In modern microprocessors the number of cores is rapidly increasing, so high-performance computing is a great challenge for application developers. ...
The authors focused on HPC [24]: shared-memory nodes with several multi-core CPUs communicating via a network infrastructure. ...
doi:10.17485/ijst/2016/v9i19/93835
fatcat:mtry5jqn75hl5oe6zan7juz3nu
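The hybrid scheme the survey covers pairs MPI between shared-memory nodes with OpenMP threads inside them. A Python analogue, with mpi4py ranks standing in for nodes and a thread pool standing in for OpenMP threads; the surveyed systems themselves are C/C++:

```python
# Hybrid-parallel sketch in the spirit of the surveyed MPI + OpenMP scheme.
# Run with: mpiexec -n 2 python hybrid.py
from concurrent.futures import ThreadPoolExecutor
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

weights = np.zeros(8)
batch = np.arange(64.0).reshape(8, 8) if rank == 0 else None
local_batch = comm.scatter(np.array_split(batch, size) if rank == 0 else None,
                           root=0)

def grad(sample: np.ndarray) -> np.ndarray:
    """Toy per-sample gradient of a linear least-squares objective."""
    return 2.0 * (weights @ sample - 1.0) * sample

with ThreadPoolExecutor() as pool:                   # intra-node (OpenMP-like)
    local_grad = sum(pool.map(grad, local_batch))

total_grad = comm.allreduce(local_grad, op=MPI.SUM)  # inter-node (MPI)
weights -= 0.01 * total_grad / 8.0                   # synchronized update
```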
The impact of on-chip communication on memory technologies for neuromorphic systems
2018
Journal of Physics D: Applied Physics
Emergent nanoscale non-volatile memory technologies with high integration density offer a promising solution to overcome the scalability limitations of CMOS-based neural network architectures, by efficiently ...
We present existing approaches for on-chip neuromorphic routing networks, and discuss how new memory and integration technologies may help to alleviate the communication issues in constructing next-generation ...
The communication architecture is also mesh based. Loihi is a fully digital architecture implemented in 14 nm CMOS process, and its routing network is designed using asynchronous circuits. ...
doi:10.1088/1361-6463/aae641
fatcat:kw6alqj6grdlpddvrtocenxwfm
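Mesh-based NoCs like those discussed here commonly use dimension-order (XY) routing: a packet travels fully along the X axis, then along Y, which avoids deadlock on a mesh because packets never turn from Y back to X. A minimal illustrative sketch, not any specific chip's router:

```python
# Dimension-order (XY) routing sketch for a 2D mesh NoC. Illustrative only;
# real neuromorphic NoCs implement this in hardware, often asynchronously.

def xy_route(src: tuple[int, int], dst: tuple[int, int]) -> list[tuple[int, int]]:
    """Return the sequence of (x, y) hops from src to dst."""
    x, y = src
    hops = []
    while x != dst[0]:                  # X phase: resolve the column first
        x += 1 if dst[0] > x else -1
        hops.append((x, y))
    while y != dst[1]:                  # Y phase: then resolve the row
        y += 1 if dst[1] > y else -1
        hops.append((x, y))
    return hops

assert xy_route((0, 0), (2, 1)) == [(1, 0), (2, 0), (2, 1)]
```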