1,085 Hits in 5.7 sec

Compiler-directed array interleaving for reducing energy in multi-bank memories

V. Delaluz, M. Kandemir, N. Vijaykrishnan, M.J. Irwin, A. Sivasubramaniam, I. Kolcu
Proceedings of ASP-DAC/VLSI Design 2002. 7th Asia and South Pacific Design Automation Conference and 15th International Conference on VLSI Design  
We propose a source-level data space transformation technique called array interleaving that co-locates simultaneously used array elements in a small set of memory modules.  ...  We validate the effectiveness of this transformation using a set of array-dominated benchmarks and observe significant savings in memory energy.  ...  Based on these results, we conclude that array interleaving is very beneficial from an energy viewpoint and should be supported by compilers targeting multi-bank memory systems.  ... 
doi:10.1109/aspdac.2002.994936 dblp:conf/vlsid/DelaluzKVISK02 fatcat:rhliy7w5ojaz7lk2rixxlkweke

Exploiting Both Pipelining and Data Parallelism with SIMD Reconfigurable Architecture [chapter]

Yongjoo Kim, Jongeun Lee, Jinyong Lee, Toan X. Mai, Ingoo Heo, Yunheung Paek
2012 Lecture Notes in Computer Science  
In this paper we introduce SIMD reconfigurable architecture, which allows for SIMD mapping at multiple levels of granularity, and investigate ways to minimize bank conflicts in a SIMD reconfigurable architecture  ...  Reconfigurable Architecture (RA), which provides extremely high energy efficiency for certain domains of applications, has one problem that current mapping algorithms for it do not scale well with the  ...  In macrocore mapping with interleaving, bank conflicts can be inevitable array elements that are never accessed. This will effectively reduce the number of banks, and increase bank conflicts.  ... 
doi:10.1007/978-3-642-28365-9_4 fatcat:n2mauiwx65a2vic6wripfqjkm4

Code Transformations for TLB Power Reduction

Reiley Jeyapaul, Aviral Shrivastava
2010 International journal of parallel programming  
In this paper, we propose compiler techniques (specifically, instruction and operand reordering, array interleaving, and loop unrolling) to reduce the page switchings in data accesses.  ...  The Translation Look-aside Buffer (TLB) is a very important part in the hardware support for virtual memory management implementation of high performance embedded systems.  ...  Compiler based Approaches A compiler-directed array interleaving technique [17] was proposed to save energy in multi-bank memory architectures with power control features.  ... 
doi:10.1007/s10766-009-0123-8 fatcat:yyfwuk2qvjdg3g55mudtllezqe

Code Transformations for TLB Power Reduction

Reiley Jeyapaul, Sandeep Marathe, Aviral Shrivastava
2009 22nd International Conference on VLSI Design  
In this paper, we propose compiler techniques (specifically, instruction and operand reordering, array interleaving, and loop unrolling) to reduce the page switchings in data accesses.  ...  The Translation Look-aside Buffer (TLB) is a very important part in the hardware support for virtual memory management implementation of high performance embedded systems.  ...  Compiler based Approaches A compiler-directed array interleaving technique [17] was proposed to save energy in multi-bank memory architectures with power control features.  ... 
doi:10.1109/vlsi.design.2009.39 dblp:conf/vlsid/JeyapaulMS09 fatcat:gcc2rwucjvchdajm2rcqczk5oe

I-cache multi-banking and vertical interleaving

Sangyeun Cho
2007 Proceedings of the 17th Great Lakes Symposium on VLSI - GLSVLSI '07  
Unlike previous multi-banking and interleaving techniques to increase cache bandwidth, the proposed vertical interleaving further divides memory banks in a cache into vertically arranged sub-banks, which  ...  This research investigates the impact of a microarchitectural technique called vertical interleaving in multi-banked caches.  ...  In addition to implementing multiple ports for higher bandwidth, cache multi-banking has been used to reduce access time, to reduce energy consumption, or to facilitate aspect ratio fitting [20, 21].  ... 
doi:10.1145/1228784.1228794 dblp:conf/glvlsi/Cho07 fatcat:d7mbnjscarepxanuvq7rdunwxa

Automatic data migration for reducing energy consumption in multi-bank memory systems

V. De La Luz, M. Kandemir, I. Kolcu
2002 Proceedings - Design Automation Conference  
An architectural solution to reducing memory energy consumption is to adopt a multi-bank memory system instead of a monolithic (single-bank) memory system.  ...  Some recent multi-bank memory architectures help reduce memory energy by allowing an unused bank to be placed into a low-power operating mode.  ...  In a multi-bank memory architecture, unused memory banks (idle banks) can be disabled, thereby saving energy.  ... 
doi:10.1145/513918.513973 dblp:conf/dac/LuzKK02 fatcat:wxb2ku2rxzbwjhooxw3qnshefe

Automatic data migration for reducing energy consumption in multi-bank memory systems

V. De La Luz, M. Kandemir, I. Kolcu
2002 Proceedings 2002 Design Automation Conference (IEEE Cat. No.02CH37324)  
An architectural solution to reducing memory energy consumption is to adopt a multi-bank memory system instead of a monolithic (single-bank) memory system.  ...  Some recent multi-bank memory architectures help reduce memory energy by allowing an unused bank to be placed into a low-power operating mode.  ...  In a multi-bank memory architecture, unused memory banks (idle banks) can be disabled, thereby saving energy.  ... 
doi:10.1109/dac.2002.1012622 fatcat:hvmjx6bfzffcdlkoguv6r725tm

Automatic data migration for reducing energy consumption in multi-bank memory systems

V. De La Luz, M. Kandemir, I. Kolcu
2002 Proceedings - Design Automation Conference  
An architectural solution to reducing memory energy consumption is to adopt a multi-bank memory system instead of a monolithic (single-bank) memory system.  ...  Some recent multi-bank memory architectures help reduce memory energy by allowing an unused bank to be placed into a low-power operating mode.  ...  In a multi-bank memory architecture, unused memory banks (idle banks) can be disabled, thereby saving energy.  ... 
doi:10.1145/513972.513973 fatcat:7hhssns2z5f4pi7vsuocjxd4ti

A Co-Design Framework with OpenCL Support for Low-Energy Wide SIMD Processor

Dongrui She, Yifan He, Luc Waeijen, Henk Corporaal
2014 Journal of Signal Processing Systems  
This compiler can analyze the static memory access patterns in OpenCL kernels, generate efficient mappings, and schedule the code to fully utilize the explicit datapath.  ...  In this paper, we propose a design framework for a configurable wide SIMD architecture that utilizes an explicit datapath to achieve high energy efficiency.  ...  In addition, further changes in the architecture, e.g., clustering PE memory banks to reduce memory energy, also require the adaption of the compiler.  ... 
doi:10.1007/s11265-014-0957-1 fatcat:bydv4yarnjcuhgmjehzj4r2i4e

Energy-oriented compiler optimizations for partitioned memory architectures

V. Delaluz, M. Kandemir, N. Vijaykrishnan, M. J. Irwin
2000 Proceedings of the international conference on Compilers, architectures, and synthesis for embedded systems - CASES '00  
This paper presents a compiler-based optimization framework that targets reducing the energy consumption in a partitioned off-chip memory architecture that contains multiple memory banks by organizing  ...  The optimizations considered in this work take advantage of low-power operating modes and the partitioned (multi-bank) structure of the off-chip memory.  ...  Specifically, we make the following contributions: We summarize the operation of a multi-bank memory system and explain how low-power operating modes can reduce its energy consumption.  ... 
doi:10.1145/354880.354900 dblp:conf/cases/DelaluzKVI00 fatcat:httjchxxnzhixeprmihjlyoc6y

An Energy-Efficient Integrated Programmable Array Accelerator and Compilation flow for Near-Sensor Ultra-low Power Processing

Satyajit Das, Kevin J. M. Martin, Davide Rossi, Philippe Coussy, Luca Benini
2018 IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems  
To optimize the performance and energy efficiency, we explore the IPA architecture with special focus on shared memory access, with the help of the flexible compilation flow presented in this paper.  ...  In this paper we give a fresh look to Coarse Grained Reconfigurable Arrays (CGRAs) as ultra-low power accelerators for near-sensor processing.  ...  We also thank STMicroelectronics for granting access to the FDSOI 28nm technology libraries.  ... 
doi:10.1109/tcad.2018.2834397 fatcat:hf6fekk4ivdkhncpycvpabsswq

Synergistic Architecture and Programming Model Support for Approximate Micropower Computing

Giuseppe Tagliavini, Davide Rossi, Luca Benini, Andrea Marongiu
2015 IEEE Computer Society Annual Symposium on VLSI  
Energy consumption is a major constraining factor for embedded multi-core systems. Using aggressive voltage scaling can reduce power consumption, but memory operations become unreliable.  ...  Several embedded applications exhibit inherent tolerance to computation approximation, for which indicating parts that can tolerate errors has proven a viable way to reduce energy consumption.  ...  This allows concurrent access to memory locations mapped on different banks, via a one-cycle-latency logarithmic interconnect implementing word-level interleaving to reduce contention.  ... 
doi:10.1109/isvlsi.2015.64 dblp:conf/isvlsi/TagliaviniRBM15 fatcat:do3naxkiibh33lxwddhvyhwiwq

Interconnect synthesis of heterogeneous accelerators in a shared memory architecture

Yu-Ting Chen, Jason Cong
2015 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED)  
The second layer of interconnect tries to interleave possible conflicting long-burst memory requests for prefetching data from off-chip memory.  ...  Furthermore, the performance of an ARA can be improved by 36%-52% with a well-designed interleaved network in a real ARA prototype for medical imaging applications.  ...  port index; d: the array recording the memory bank demands for all accelerators; n: the number of accelerators; m: the number of memory banks; c: the number of simultaneous powered-on accelerators  ... 
doi:10.1109/islped.2015.7273540 dblp:conf/islped/ChenC15 fatcat:cnjlygfbtzeihg34ldmeyskmqm

RowClone

Vivek Seshadri, Michael A. Kozuch, Todd C. Mowry, Yoongu Kim, Chris Fallin, Donghyuk Lee, Rachata Ausavarungnirun, Gennady Pekhimenko, Yixin Luo, Onur Mutlu, Phillip B. Gibbons
2013 Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture - MICRO-46  
Our results show that RowClone can significantly improve both single-core and multi-core system performance, while also significantly reducing main memory bandwidth and energy consumption.  ...  In this work, we propose RowClone, a new and simple mechanism to perform bulk copy and initialization completely within DRAM, eliminating the need to transfer any data over the memory channel to perform  ...  This research was partially supported by NSF (CCF-0953246, CCF-1147397, CCF-1212962), Intel University Research Office Memory Hierarchy Program, Intel Science and Technology Center for Cloud Computing, and  ... 
doi:10.1145/2540708.2540725 dblp:conf/micro/SeshadriKFLAPLMGKM13 fatcat:raul5gjizzan3knaladyxquvze

Data access optimization in a processing-in-memory system

Zehra Sura, Kevin O'Brien, Ravi Nair, Arpith Jacob, Tong Chen, Bryan Rosenburg, Olivier Sallenave, Carlo Bertolli, Samuel Antao, Jose Brunheroto, Yoonho Park
2015 Proceedings of the 12th ACM International Conference on Computing Frontiers - CF '15  
In this paper, we describe a combination of programming language features, compiler techniques, operating system interfaces, and hardware design that can effectively hide memory latency for the processing  ...  The AMC architecture includes general-purpose host processors and specially designed in-memory processors (processing lanes) that would be integrated in a logic layer within 3D DRAM memory.  ...  Further, the proximity to memory reduces the latency of memory accesses. • No caches or scratchpad memory for in-memory processors: This design choice saves area and power that would have been spent on  ... 
doi:10.1145/2742854.2742863 dblp:conf/cf/SuraJCRSBABPON15 fatcat:yelf7omghbc2jmfia7kcv73yeq