An Algorithm-Hardware Co-Design for Bayesian Neural Network Utilizing SOT-MRAM's Inherent Stochasticity

Anni Lu, Yandong Luo, Shimeng Yu
2022 IEEE Journal on Exploratory Solid-State Computational Devices and Circuits  
Until now, there are few hardware considerations to address the intensive computation and true random number generation for Bayesian neural network (BayesNN), whose weights are represented by probability  ...  The evaluation on the CIFAR-10 dataset suggests that BayesNN could achieve comparable accuracy as conventional deep neural network (DNN) with acceptable hardware overhead but provide much better uncertainty  ...  ALGORITHM-HARDWARE CO-DESIGN OVERVIEW A.  ... 
doi:10.1109/jxcdc.2022.3177588 fatcat:jbeu6uxzcfhtpoqrlhgn6e6nxq

SME: ReRAM-based Sparse-Multiplication-Engine to Squeeze-Out Bit Sparsity of Neural Network [article]

Fangxin Liu, Wenbo Zhao, Yilong Zhao, Zongwu Wang, Tao Yang, Zhezhi He, Naifeng Jing, Xiaoyao Liang, Li Jiang
2021 arXiv   pre-print
Resistive Random-Access-Memory (ReRAM) crossbar is a promising technique for deep neural network (DNN) accelerators, thanks to its in-memory and in-situ analog computing abilities for Vector-Matrix Multiplication-and-Accumulations  ...  Second, we propose a novel weight mapping mechanism to slice the bits of a weight across the crossbars and splice the activation results in peripheral circuits.  ...  We propose an algorithm-hardware co-design framework called SME by novel weight mapping schemes and data path design to squeeze out the bit-wise sparsity.  ... 
arXiv:2103.01705v1 fatcat:uzoslqgegzewhiuso5pj6sddzq

An Ultra-Efficient Memristor-Based DNN Framework with Structured Weight Pruning and Quantization Using ADMM [article]

Geng Yuan, Xiaolong Ma, Caiwen Ding, Sheng Lin, Tianyun Zhang, Zeinab S. Jalali, Yilong Zhao, Li Jiang, Sucheta Soundarajan, Yanzhi Wang
2019 arXiv   pre-print
The high computation and memory storage of large deep neural networks (DNNs) models pose intensive challenges to the conventional Von-Neumann architecture, incurring substantial data movements in the memory  ...  Experimental results show that our proposed framework achieves 29.81X (20.88X) weight compression ratio, with 98.38% (96.96%) and 98.29% (97.47%) power and area reduction on VGG-16 (ResNet-18) network  ...  We thank all anonymous reviewers for their feedback.  ... 
arXiv:1908.11691v1 fatcat:krezi3afofdjxfiv5eohaffxay

FORMS: Fine-grained Polarized ReRAM-based In-situ Computation for Mixed-signal DNN Accelerator [article]

Geng Yuan, Payman Behnam, Zhengang Li, Ali Shafiee, Sheng Lin, Xiaolong Ma, Hang Liu, Xuehai Qian, Mahdi Nazm Bojnordi, Yanzhi Wang, Caiwen Ding
2021 arXiv   pre-print
Instead of trying to represent the positive/negative weights, our key design principle is to enforce exactly what is assumed in the in-situ computation -- ensuring that all weights in the same column of  ...  intensive and key computation in DNNs.  ...  CONCLUSION We propose FORMS, a fine-grained ReRAM-based DNN accelerator with algorithm/hardware co-design optimizations.  ... 
arXiv:2106.09144v1 fatcat:qsn6nmh6u5entbk5qjdm6fkcoe

A Survey of Near-Data Processing Architectures for Neural Networks [article]

Mehdi Hassanpour, Marc Riera, Antonio González
2021 arXiv   pre-print
and in/near-memory computation/search engine.  ...  , and especially neural network (NN)-based accelerators has grown significantly.  ...  it may be important to investigate novel algorithm-hardware to perform accumulations, and although the technology pro- co-designs  ... 
arXiv:2112.12630v1 fatcat:drkwrztkazd3hlblxc7i4kgn2a

Roadmap on emerging hardware and technology for machine learning

Qiangfei Xia, Karl K Berggren, Konstantin Likharev, Dmitri B Strukov, Hao Jiang, Thomas Mikolajick, Damien Querlioz, Martin Salinga, John Erickson, Shuang Pi, Feng Xiong, Peng Lin (+31 others)
2020 Nanotechnology  
Recent progress in artificial intelligence is largely attributed to the rapid development of machine learning, especially in the algorithm and neural network models.  ...  Data-centric computing requires a revolution in hardware systems, since traditional digital computers based on transistors and the von Neumann architecture were not purposely designed for neuromorphic  ...  Acknowledgments We thank Dr Zhongrui Wang, Dr Peng Lin, Dr Can Li for their help in preparing this section of roadmap. ORCID iDs  ... 
doi:10.1088/1361-6528/aba70f pmid:32679577 fatcat:t6me4pfxgfhdvbdqjnyjuksf2e

Closed-Loop Neural Prostheses with On-Chip Intelligence: A Review and A Low-Latency Machine Learning Model for Brain State Detection [article]

Bingzhao Zhu, Uisub Shin, Mahsa Shoaran
2021 arXiv   pre-print
Next, we review state-of-the-art neural prostheses with on-chip machine learning, focusing on application-specific integrated circuits (ASIC).  ...  The application of closed-loop approaches in systems neuroscience and therapeutic stimulation holds great promise for revolutionizing our understanding of the brain and for developing novel neuromodulation  ...  techniques, accommodating high channel counts and the need for online learning. • Novel algorithm-hardware co-design techniques for next-generation energy-efficient neural prostheses.  ... 
arXiv:2109.05848v1 fatcat:m6ib42vqpngb5mal4wkkmuydfm

P2M: A Processing-in-Pixel-in-Memory Paradigm for Resource-Constrained TinyML Applications [article]

Gourav Datta, Souvik Kundu, Zihan Yin, Ravi Teja Lakkireddy, Joe Mathai, Ajey Jacob, Peter A. Beerel, Akhilesh R. Jaiswal
2022 arXiv   pre-print
Our solution includes a holistic algorithm-circuit co-design approach and the resulting P2M paradigm can be used as a drop-in replacement for embedding memory-intensive first few layers of convolutional  ...  neural network (CNN) models within foundry-manufacturable CMOS image sensor platforms.  ...  Acknowledgements We would like to acknowledge the DARPA HR00112190120 award for supporting this work.  ... 
arXiv:2203.04737v2 fatcat:xcwvrkayljaujeygdcnsstf334

Survey on Near-Data Processing: Applications and Architectures

Paulo Cesar Santos, Francis Birck Moreira, Aline Santana Cordeiro, Sairo Raoní Santos, Tiago Rodrigo Kepe, Luigi Carro, Marco Antonio Zanata Alves
2021 Journal of Integrated Circuits and Systems  
It occurred together with the appearance of 3D-stacked chips with logic and memory stacked layers.  ...  One of the main challenges for modern processors is the data transfer between processor and memory. Such data movement implies high latency and high energy consumption.  ...  Mondrian [79, 80] implements an algorithm-hardware co-design using the 3D-stacked memory for NDP of data analytics operators, which also apply to the data partitioning and shuffling phase of MapReduce  ... 
doi:10.29292/jics.v16i2.502 fatcat:3uiswd6z65djpjgvsxclutthxu

Applications and Techniques for Fast Machine Learning in Science [article]

Allison McCarn Deiana, Joshua Agar, Michaela Blott, Giuseppe Di Guglielmo, Javier Duarte, Philip Harris, Scott Hauck, Mia Liu, Mark S. Neubauer, Jennifer Ngadiuba, Seda Ogrenci-Memik, Maurizio Pierini (+74 others)
2021 arXiv   pre-print
In this community review report, we discuss applications and techniques for fast machine learning (ML) in science -- the concept of integrating power ML methods into the real-time experimental data processing  ...  training and implementing performant and resource-efficient ML algorithms; and computing architectures, platforms, and technologies for deploying these algorithms.  ...  of peripheral circuits [576] .  ... 
arXiv:2110.13041v1 fatcat:cvbo2hmfgfcuxi7abezypw2qrm

FPGA-Based Inter-layer Pipelined Accelerators for Filter-Wise Weight-Balanced Sparse Fully Convolutional Networks with Overlapped Tiling

Masayuki Shimoda, Youki Sada, Hiroki Nakahara
2021 Journal of Signal Processing Systems  
Convolutional neural networks (CNNs) exhibit state-of-the-art performance while performing computer-vision tasks.  ...  A different method is used to minimize the input image size, for real-time processing, but it causes a considerable drop in accuracy.  ...  Acknowledgements This research is supported in part by the Grants in Aid for Scientific Research of JSPS, Industry-academia collaborative R&D programs centre of innovation (COI) program, Core Research  ... 
doi:10.1007/s11265-021-01642-6 fatcat:vfcfmfjvyfahphf6xiuy5zagqu

2022 Roadmap on Neuromorphic Computing and Engineering [article]

Dennis V. Christensen, Regina Dittmann, Bernabé Linares-Barranco, Abu Sebastian, Manuel Le Gallo, Andrea Redaelli, Stefan Slesazeck, Thomas Mikolajick, Sabina Spiga, Stephan Menzel, Ilia Valov, Gianluca Milano (+47 others)
2022 arXiv   pre-print
built-in capabilities to learn or deal with complex data as our brain does.  ...  This data transfer is responsible for a large part of the power consumption. The next generation computer technology is expected to solve problems at the exascale with 10^18 calculations each second.  ...  Concluding Remarks Integrating event-based vision sensing and processing with neuromorphic computation techniques is expected to yield solutions that will be able to penetrate the artificial vision market  ... 
arXiv:2105.05956v3 fatcat:pqir5infojfpvdzdwgmwdhsdi4

Correlation Power Analysis Attack Resisted Cryptographic RISC-V SoC with Random Dynamic Frequency Scaling Countermeasure

Ba-Anh Dao, Trong-Thuc Hoang, Anh-Tien Le, Akira Tsukamoto, Kuniyasu Suzaki, Cong-Kha Pham
2021 IEEE Access  
In these systems, cryptographic accelerators are integrated with processor cores to provide users with the software's flexibility and hardware's high performance.  ...  The proposed RDFS countermeasure improved the power analysis resistance while maintaining low-performance overhead and hardware costs by generating more than 219,000 distinct frequencies for driving only  ...  In [23] , Benadjila et al. introduced a profiled DL-SCA attack utilizing a convolutional neural network (CNN) that is suitable for attacking highly desynchronized power traces.  ... 
doi:10.1109/access.2021.3126703 fatcat:nirhzlv2q5hyvaeq3fybs7sffi

Improving Reliability, Security, and Efficiency of Reconfigurable Hardware Systems [article]

Daniel Ziener
2018 arXiv   pre-print
In the area of reliability, countermeasures against radiation-induced faults and aging effects for long mission times were investigated and applied to SRAM-FPGA-based satellite systems.  ...  This technique was applied to the acceleration of SQL queries for large database applications as well as for image and signal processing applications.  ...  The research has been carried out in collaboration with several doctoral researchers, master and bachelor students from my research group Reconfigurable Computing. In  ... 
arXiv:1809.11156v1 fatcat:6ttulp2tancyvds7fk2coxoptq

Advancing Neuromorphic Computing With Loihi: A Survey of Results and Outlook

Mike Davies, Andreas Wild, Garrick Orchard, Yulia Sandamirskaya, Gabriel A. Fonseca Guerra, Prasad Joshi, Philipp Plank, Sumedh R. Risbud
2021 Proceedings of the IEEE  
This is now changing with the advent of Intel's Loihi, a neuromorphic research processor designed to support a broad range of spiking neural networks with sufficient scale, performance, and features to  ...  KEYWORDS | Computer architecture; neural network hardware; neuromorphics. I.  ...  We believe that these will be well matched for the unique features of neuromorphic architectures and will come in time with ongoing algorithm-hardware codevelopment.  ... 
doi:10.1109/jproc.2021.3067593 fatcat:krqdmy3u6jdvfl7btjglek5ag4