A Survey of Resource Management for Processing-in-Memory and Near-Memory Processing Architectures [article]

Kamil Khan, Sudeep Pasricha, Ryan Gary Kim
2020 arXiv   pre-print
However, application-specific memory access patterns, power and thermal concerns, memory technology limitations, and inconsistent performance gains complicate the offloading of computation in DCC systems  ...  Data-centric computing (DCC), as enabled by processing-in-memory (PIM) and near-memory processing (NMP) paradigms, aims to accelerate these types of applications by moving the computation closer to the  ...  Case Study 1: CoolPIM: Thermal-Aware Source Throttling for Efficient PIM Instruction Offloading As noted in Section 4.1.3, 3D-stacked memory is vulnerable to thermal problems due to high power densities  ... 
arXiv:2009.09603v1 fatcat:aylcbzdsrrdbzgifdqxqs2wcaq

A Survey of Resource Management for Processing-In-Memory and Near-Memory Processing Architectures

Kamil Khan, Sudeep Pasricha, Ryan Gary Kim
2020 Journal of Low Power Electronics and Applications  
However, application-specific memory access patterns, power and thermal concerns, memory technology limitations, and inconsistent performance gains complicate the offloading of computation in DCC systems  ...  Data-centric computing (DCC), as enabled by processing-in-memory (PIM) and near-memory processing (NMP) paradigms, aims to accelerate these types of applications by moving the computation closer to the  ...  • Case Study 1: CoolPIM-Thermal-Aware Source Throttling for Efficient PIM Instruction Offloading As noted in Section 4.1.3, 3D-stacked memory is vulnerable to thermal problems due to high power densities  ... 
doi:10.3390/jlpea10040030 fatcat:yzpcli2ynfe4hbfpabvwinn2fi

Introduction to the Cell multiprocessor

J. A. Kahle, M. N. Day, H. P. Hofstee, C. R. Johns, T. R. Maeurer, D. Shippy
2005 IBM Journal of Research and Development  
In the end, however, it is the hard work of more than four hundred engineers and their managers that turned this concept into a reality.  ...  John Kelly, Lisa Su, and Bijan Davari from IBM and senior management in the three companies created the right business conditions for this project.  ...  If a data or instruction fetch misses in the caches, resulting in an access to main memory, instruction processing can only proceed in a speculative manner, assuming that the access to main memory will  ... 
doi:10.1147/rd.494.0589 fatcat:7nj6ionujfcl7oxmv5vvpfjyhm

FASTER run-time reconfiguration management

Cătălin Bogdan Ciobanu, Dionisios N. Pnevmatikatos, Kyprianos D. Papadimitriou, Georgi N. Gaydadjiev
2013 Proceedings of the 27th ACM International Conference on Supercomputing - ICS '13
The FASTER project Run-Time System Manager offloads programmers from low-level operations by performing task placement, scheduling, and dynamic FPGA reconfiguration.  ...  It also manages device fragmentation, configuration caching, pre-fetching and reuse, bitstream compression, and optimizes the system thermal and power footprints.  ...  Acknowledgments This work was supported by the European Commission in the context of FP7 FASTER project (#287804).  ... 
doi:10.1145/2464996.2467283 dblp:conf/ics/CiobanuPPG13 fatcat:pzlky5ilfjalliwham3w6ykkde

Processing Data Where It Makes Sense: Enabling In-Memory Computation [article]

Onur Mutlu, Saugata Ghose, Juan Gómez-Luna, Rachata Ausavarungnirun
2019 arXiv   pre-print
Our focus is on the development of in-memory processing designs that can be adopted in real computing platforms at low cost.  ...  We discuss at least two promising directions for processing-in-memory (PIM): (1) performing massively-parallel bulk operations in memory by exploiting the analog operational properties of DRAM, with low-cost  ...  PEI: PIM-Enabled Instructions PIM-Enabled Instructions (PEI) [35] aims to provide the minimal processing-in-memory support to take advantage of PIM using 3D-stacked memory, in a way that can achieve  ... 
arXiv:1903.03988v1 fatcat:l2sl2wqwmrejvfbi3sxrpwasby

A Modern Primer on Processing in Memory [article]

Onur Mutlu, Saugata Ghose, Juan Gómez-Luna, Rachata Ausavarungnirun
2020 arXiv   pre-print
This chapter discusses recent research that aims to practically enable computation close to data, an approach we call processing-in-memory (PIM).  ...  PIM places computation mechanisms in or near where the data is stored (i.e., inside the memory chips, in the logic layer of 3D-stacked memory, or in the memory controllers), so that data movement between  ...  In summary, PIM-Enabled Instructions provide the illusion that PIM operations are executed as if they were host instructions: the programmer may not even be aware that the code is executing on a PIM-capable  ... 
arXiv:2012.03112v1 fatcat:hq2i2xzq4nbszenq7rqmjzcjci

Process migration-based computational offloading framework for IoT-supported mobile edge/cloud computing

Abdullah Yousafzai, Ibrar Yaqoob, Muhammad Imran, Abdullah Gani, Rafidah Md Noor
2019 IEEE Internet of Things Journal  
In this paper, we analyze the effect of platform-dependent native applications on computational offloading in edge networks and propose a lightweight process migration-based computational offloading framework  ...  Hence, the proposed framework shows profound potential for resource-intensive IoT application processing in MEC.  ...  Lastly, all user-space memory is stored to the checkpoint image (/proc/self/maps), which includes any library that the process is using.  ... 
doi:10.1109/jiot.2019.2943176 fatcat:7to2d5aqnndthjvx47bzluonoe

How to Reach Real-Time AI on Consumer Devices? Solutions for Programmable and Custom Architectures [article]

Stylianos I. Venieris, Ioannis Panopoulos, Ilias Leontiadis, Iakovos S. Venieris
2021 arXiv   pre-print
to users, in a robust and efficient manner.  ...  Collectively, these results highlight the critical need for further exploration as to how the various cross-stack solutions can be best combined in order to bring the latest advances in deep learning close  ...  Adopting a different viewpoint, MASA [88] comprises a memory-aware scheduler for minimising the memory swapping between DNNs.  ... 
arXiv:2106.15021v1 fatcat:b25jifosajeuba57qxiaockmg4

PIM-enabled instructions

Junwhan Ahn, Sungjoo Yoo, Onur Mutlu, Kiyoung Choi
2015 Proceedings of the 42nd Annual International Symposium on Computer Architecture - ISCA '15  
Processing-in-memory (PIM) is rapidly rising as a viable solution for the memory wall crisis, rebounding from its unsuccessful attempts in the 1990s due to practicality concerns, which are alleviated with  ...  The key idea is to implement simple in-memory computation using compute-capable memory commands and use specialized instructions, which we call PIM-enabled instructions, to invoke in-memory computation  ...  10041608, Embedded System Software for New Memory-based Smart Devices).  ... 
doi:10.1145/2749469.2750385 dblp:conf/isca/AhnYMC15 fatcat:j5lepsilqfedzdxp2iu5ozbvji

PIM-enabled instructions

Junwhan Ahn, Sungjoo Yoo, Onur Mutlu, Kiyoung Choi
2015 SIGARCH Computer Architecture News  
Processing-in-memory (PIM) is rapidly rising as a viable solution for the memory wall crisis, rebounding from its unsuccessful attempts in the 1990s due to practicality concerns, which are alleviated with  ...  The key idea is to implement simple in-memory computation using compute-capable memory commands and use specialized instructions, which we call PIM-enabled instructions, to invoke in-memory computation  ...  10041608, Embedded System Software for New Memory-based Smart Devices).  ... 
doi:10.1145/2872887.2750385 fatcat:ckgc7e3kxrcdlde6z32e3vxsu4

Breaking the von Neumann bottleneck: architecture-level processing-in-memory technology

Xingqi Zou, Sheng Xu, Xiaoming Chen, Liang Yan, Yinhe Han
2021 Science China Information Sciences  
Keywords processing-in-memory (PIM), von Neumann bottleneck, memory wall, PIM simulator, architecture-level PIM Citation Zou X Q, Xu S, Chen X M, et al.  ...  Processing-in-memory (PIM) has been proposed as a promising solution to break the von Neumann bottleneck by minimizing data movement between memory hierarchies.  ...  Processing-in-memory, also known as logic-in-memory or near-data processing, can be divided into the device level, circuit level, and architectural level.  ... 
doi:10.1007/s11432-020-3227-1 fatcat:2vk6iycqxvcpdjq2vlxlhhd6hi

Hardware-Accelerated Platforms and Infrastructures for Network Functions: A Survey of Enabling Technologies and Research Studies

Prateek Shantharama, Akhilesh S. Thyagaturu, Martin Reisslein
2020 IEEE Access  
In DAX mode, the applications and OS have to be PM memory aware such that dedicated CPU load and store instructions specific to PM memory access are used for the transactions between CPU and NVDIMMs.  ...  In addition to the OS and hypervisors managing the CPU resources, the NF application designers can become aware of the CPU capabilities through the CPU instruction CPUID and develop strategies to run the  ... 
doi:10.1109/access.2020.3008250 fatcat:kv4znpypqbatfk2m3lpzvzb2nu

Data reorganization in memory using 3D-stacked DRAM

Berkin Akin, Franz Franchetti, James C. Hoe
2015 Proceedings of the 42nd Annual International Symposium on Computer Architecture - ISCA '15  
the memory hierarchy.  ...  For the various test cases, in-memory data reorganization provides orders of magnitude performance and energy efficiency improvements via low overhead hardware.  ...  The content, views and conclusions presented in this document do not necessarily reflect the position or the policy of DARPA or the U.S. Government, no official endorsement should be inferred.  ... 
doi:10.1145/2749469.2750397 dblp:conf/isca/AkinFH15 fatcat:233grv4wkvbythw4d6jp4hdiv4

Data reorganization in memory using 3D-stacked DRAM

Berkin Akin, Franz Franchetti, James C. Hoe
2015 SIGARCH Computer Architecture News  
the memory hierarchy.  ...  For the various test cases, in-memory data reorganization provides orders of magnitude performance and energy efficiency improvements via low overhead hardware.  ...  The content, views and conclusions presented in this document do not necessarily reflect the position or the policy of DARPA or the U.S. Government, no official endorsement should be inferred.  ... 
doi:10.1145/2872887.2750397 fatcat:qiqveldvz5gudbehcr7mzwa7n4

Design and analysis of 3D-MAPS: A many-core 3D processor with stacked memory

Michael B. Healy, Krit Athikulwongse, Rohan Goel, Mohammad M. Hossain, Dae Hyun Kim, Young-Joon Lee, Dean L. Lewis, Tzu-Wei Lin, Chang Liu, Moongon Jung, Brian Ouellette, Mohit Pathak (+7 others)
2010 IEEE Custom Integrated Circuits Conference 2010  
We describe the design and analysis of 3D-MAPS, a 64-core 3D-stacked memory-on-processor running at 277 MHz with 63 GB/s memory bandwidth, sent for fabrication using Tezzaron's 3D stacking technology.  ...  When the memory instruction is absent, our ISA allows certain commonly used ALU instructions to be executed in the memory pipeline.  ...  In our two-way instruction format, one slot is dedicated to a memory instruction to consume memory bandwidth every cycle from the 3D-stacked memory while the other slot is tailored for an ALU instruction  ... 
doi:10.1109/cicc.2010.5617464 dblp:conf/cicc/HealyAGHKLLLLJOPSSWZLLL10 fatcat:ktjxy5ugh5glvkzve2l7qzdztm