Functionality-Based Processing-In-Memory Accelerator for Deep Convolutional Neural Networks

Min-Jae Kim, Jeong-Geun Kim, Su-Kyung Yoon, Shin-Dug Kim
2021 IEEE Access  
Fourth, we compose a PIM scheduler for PIM core-level autonomous request control.  ...  The PIM cores subsequently enhance computational utilization and data accessibility.  ...  GraphPIM exploits an offloading mechanism for the atomic operations of graph programs. GraphPIM only considered graph analysis workloads. Kim et al.  ... 
doi:10.1109/access.2021.3122818 fatcat:eer2kddywje7fjugcud5uokulu

NMPO: Near-Memory Computing Profiling and Offloading [article]

Stefano Corda, Madhurya Kumaraswamy, Ahsan Javed Awan, Roel Jordans, Akash Kumar, Henk Corporaal
2021 arXiv   pre-print
This work proposes Near-Memory computing Profiling and Offloading (NMPO), a high-level framework capable of predicting NMC offloading suitability employing an ensemble machine learning model.  ...  Near-memory computing (NMC), a modern data-centric computational paradigm, can alleviate these bottlenecks, thereby improving the performance of applications.  ...  [39] extend GraphPIM [9] and propose a compiler-based mechanism for instruction offloading on CPU/GPU-NMC systems. Ahmed et al.  ... 
arXiv:2106.15284v1 fatcat:i23xfrfsjrf4po2sxgg3rivhze

Development of processing-in-memory

Jiwu SHU, Haiyu MAO, Fei LI, Zhe LIU
2021 Scientia Sinica Informationis  
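[Figure 22 labels: GraphPIM framework; virtual memory space; graph structure; other data; malloc(); graph property; PIM mem region; pmr_malloc(); core; POU; caches; atomic unit; HMC; host processor]  ...  In the virtual address space, GraphPIM uses instructions of the conventional system to bypass the cache when allocating graph data; on the hardware side, GraphPIM adds a POU (PIM offloading unit) to the CPU to decide which operations are executed in the NDC cube. In addition, GraphPIM thoroughly analyzes the characteristics of graph computing applications to determine which parts would gain performance from running on the near-data processing side.  ...  Jiwu SHU was born in 1968. He received his Ph.D. degree in Computer Science from Nanjing University, Nanjing, in 1998.  ... 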
doi:10.1360/ssi-2020-0037 fatcat:ltknlty5dzeldiehnd2st5yqja

Near-Memory Computing: Past, Present, and Future [article]

Gagandeep Singh, Lorenzo Chelini, Stefano Corda, Ahsan Javed Awan, Sander Stuijk, Roel Jordans, Henk Corporaal, Albert-Jan Boonstra
2019 arXiv   pre-print
We also provide a glimpse of our approach to near-memory computing that includes i) NMC specific microarchitecture independent application characterization ii) a compiler framework to offload the NMC kernels  ...  At the same time, the advancement in 3D integration technologies has made the decade-old concept of coupling compute units close to the memory --- called near-memory computing (NMC) --- more viable.  ...  ACKNOWLEDGMENT This work was performed in the framework of Horizon 2020 program for the project "Near-Memory Computing (Ne-MeCo)" and is funded by European Commission under Marie Sklodowska-Curie Innovative  ... 
arXiv:1908.02640v1 fatcat:nvppe5zx2vbb5k4phfi4o5qoqq

Survey on Near-Data Processing: Applications and Architectures

Paulo Cesar Santos, Francis Birck Moreira, Aline Santana Cordeiro, Sairo Raoní Santos, Tiago Rodrigo Kepe, Luigi Carro, Marco Antonio Zanata Alves
2021 Journal of Integrated Circuits and Systems  
Such proposals alleviate the memory bottleneck by moving instructions to data whereabouts.  ...  The first proposals date back to the 1990s, but it was only in the 2010s that we could observe an increase in papers addressing NDP.  ...  [54] propose a framework called GraphPIM that efficiently utilizes NDP for graph computing by enabling instruction-level NDP offloading for generic graph computing frameworks with minor changes in both  ... 
doi:10.29292/jics.v16i2.502 fatcat:3uiswd6z65djpjgvsxclutthxu

CGAcc: A Compressed Sparse Row Representation-Based BFS Graph Traversal Accelerator on Hybrid Memory Cube

Cheng Qian, Bruce Childers, Libo Huang, Hui Guo, Zhiying Wang
2018 Electronics  
In the runtime, CGAcc pipelines the prefetching to fetch data from DRAM arrays to improve memory-level parallelism.  ...  To further reduce the access latency, several optimized internal caches are also introduced to hold the prefetched data to be Processed In-Memory (PIM).  ...  This computation is a simple form of Processing In-Memory (PIM) tailored to graphs.  ... 
doi:10.3390/electronics7110307 fatcat:fkp4vgps6zg77ebcnfhfdp7hhm

Continual Learning Approach for Improving the Data and Computation Mapping in Near-Memory Processing System [article]

Pritam Majumder, Jiayi Huang, Sungkeun Kim, Abdullah Muzahid, Dylan Siegers, Chia-Che Tsai, Eun Jung Kim
2021 arXiv   pre-print
Along with NMP and memory system development, the mapping for placing data and guiding computation in the memory-cube network has become crucial in driving the performance improvement in NMP.  ...  In this paper, we propose an artificially intelligent memory mapping scheme, AIMM, that optimizes data placement and resource utilization through page and computation remapping.  ...  PIM-Enabled Instruction (PEI) [5], for instance, offloads instructions from the CPU to memory using the extended ISA.  ... 
arXiv:2104.13671v1 fatcat:fe2slbojkndufibfikgxpajqqe

NERO: A Near High-Bandwidth Memory Stencil Accelerator for Weather Prediction Modeling [article]

Gagandeep Singh, Dionysios Diamantopoulos, Christoph Hagleitner, Juan Gomez-Luna, Sander Stuijk, Onur Mutlu, Henk Corporaal
2020 arXiv   pre-print
We focus on compound stencils that are fundamental kernels in weather prediction models.  ...  By using high-level synthesis techniques, we develop NERO, an FPGA+HBM-based accelerator connected through IBM CAPI2 (Coherent Accelerator Processor Interface) to an IBM POWER9 host system.  ...  Nai et al., “GraphPIM: Enabling Instruction-Level PIM Offloading in Graph [98] P.-A. Tsai et al., “Jenga: Software-Defined Cache Hierarchies,” in ISCA, 2017.  ... 
arXiv:2009.08241v1 fatcat:mj6kwwbhhne5xa7y2ilppi3ora