
A selective compressed memory system by on-line data decompressing

Jang-Soo Lee, Won-Kee Hong, Shin-Dug Kim
1999 Proceedings 25th EUROMICRO Conference. Informatics: Theory and Practice for the New Millennium  
The selective compression technique can reduce the decompression overhead caused by on-line data decompression, and the fixed memory space allocation allows efficient management of the compressed blocks  ...  Furthermore, a large amount of the decompression overhead can be reduced, and thus the average memory access time can be reduced by up to 20% relative to conventional memory systems.  ...  To reduce or hide the decompression overhead, SCMS employs several effective techniques such as selective compression, parallel decompression, and a decompression buffer.  ... 
doi:10.1109/eurmic.1999.794470 dblp:conf/euromicro/LeeHK99 fatcat:hjpagiomubclvlfns4imb5phg4
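A minimal sketch of the selective-compression policy this snippet describes: a block is stored compressed only if it fits a fixed-size slot, so decompression cost is paid only where compression actually helps. The compressor, slot size, and interfaces here are illustrative stand-ins, not the paper's SCMS hardware.

```python
# Illustrative sketch of selective compression with fixed-size slots.
# zlib stands in for the hardware compressor; SLOT_SIZE is hypothetical.
import zlib

SLOT_SIZE = 32  # fixed compressed-slot size in bytes (assumed for illustration)

def store_block(block: bytes):
    """Return (is_compressed, payload) for a memory block."""
    packed = zlib.compress(block)
    if len(packed) <= SLOT_SIZE:
        return True, packed   # fits the fixed slot: keep compressed
    return False, block       # poor ratio: keep uncompressed, no decompression cost later

def load_block(is_compressed: bool, payload: bytes) -> bytes:
    """Reverse of store_block: decompress only when the block was compressed."""
    return zlib.decompress(payload) if is_compressed else payload
```

The fixed slot size is what makes management of compressed blocks simple: every compressed block occupies the same space, at the cost of rejecting blocks that compress poorly.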

C-Pack: A High-Performance Microprocessor Cache Compression Algorithm

Xi Chen, Lei Yang, Robert P Dick, Li Shang, Haris Lekatsas
2010 IEEE Transactions on Very Large Scale Integration (VLSI) Systems  
Accessing off-chip memory generally takes an order of magnitude more time than accessing on-chip cache, and two orders of magnitude more time than executing an instruction.  ...  Microprocessor designers have been torn between tight constraints on the amount of on-chip cache memory and the high latency of off-chip memory, such as dynamic random access memory.  ...  ACKNOWLEDGMENT The authors would like to acknowledge A. Alameldeen at Intel Corporation for his help understanding his cache compression research results.  ... 
doi:10.1109/tvlsi.2009.2020989 fatcat:vl2hhmdxfzealpkjnd6z2a6z7e
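A simplified sketch of the pattern-plus-dictionary idea behind C-Pack: each 32-bit word of a cache line is encoded as a frequent pattern (e.g. all-zero), a match against a small dictionary of recently seen words, or a raw literal. The token format, dictionary size, and replacement policy here are made up for clarity and omit C-Pack's partial-match patterns and bit-level packing.

```python
# Illustrative pattern + dictionary compression of one cache line of words.
# Real C-Pack uses fixed bit codes and partial matches; this only shows the idea.
def compress_line(words, dict_size=16):
    dictionary, out = [], []
    for w in words:
        if w == 0:
            out.append(("zero",))                       # frequent pattern: all-zero word
        elif w in dictionary:
            out.append(("match", dictionary.index(w)))  # full dictionary hit
        else:
            out.append(("raw", w))                      # literal; also learn it
            dictionary.append(w)
            if len(dictionary) > dict_size:
                dictionary.pop(0)                       # evict oldest entry
    return out

def decompress_line(tokens, dict_size=16):
    # Rebuilds the same dictionary state the compressor had, so indices align.
    dictionary, words = [], []
    for t in tokens:
        if t[0] == "zero":
            words.append(0)
        elif t[0] == "match":
            words.append(dictionary[t[1]])
        else:
            words.append(t[1])
            dictionary.append(t[1])
            if len(dictionary) > dict_size:
                dictionary.pop(0)
    return words
```

Because the decompressor replays the compressor's dictionary updates in order, no dictionary needs to be stored with the line, which keeps the hardware metadata small.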

Design and Implementation of a High-Performance Microprocessor Cache Compression Algorithm

Xi Chen, Lei Yang, Haris Lekatsas, Robert P. Dick, Li Shang
2008 Data Compression Conference (DCC), Proceedings  
We present a lossless compression algorithm that has been designed for on-line memory hierarchy compression, and cache compression in particular.  ...  However, most past work, and in particular work on cache compression, has made unsubstantiated assumptions about the performance, power consumption, and area overheads of the required compression hardware  ...  Cache compression is one such technique; data in lastlevel on-chip caches, e.g., L2 caches, are compressed, resulting in larger usable caches.  ... 
doi:10.1109/dcc.2008.90 dblp:conf/dcc/ChenYLDS08 fatcat:w6ryse6s7jagpdl23m2rq3ep5e

A VLSI Approach for Cache Compression in Microprocessor

Sharada Guptha M N, H. S. Pradeep, M Z Kurian
2011 International Journal of Instrumentation Control and Automation  
Because of these, memory system designers may find cache compression an advantageous method to increase the speed of a microprocessor-based system, as it increases cache capacity and off-chip bandwidth  ...  However, most past work, and all work on cache compression, has made unsubstantiated assumptions about the performance, power consumption, and area overheads of the proposed compression algorithms  ...  Cache lines are compressed to predetermined sizes that never exceed their original size to reduce decompression overhead.  ... 
doi:10.47893/ijica.2011.1034 fatcat:r4etyf4trndyndjkafbmwinnfi

C-Pack: Cache Compression for Microprocessor Performance

T. Narasimhulu
2011 International Journal of Power System Operation and Energy Management  
In this work, I present a lossless compression algorithm that has been designed for fast on-line data compression, and cache compression in particular.  ...  However, most past work, and all work on cache compression, has made unsubstantiated assumptions about the performance, power consumption, and area overheads of the proposed compression algorithms and  ...  Cache compression is one such technique; data in last-level on-chip caches, e.g., L2 caches, are compressed, resulting in larger usable caches.  ... 
doi:10.47893/ijpsoem.2011.1019 fatcat:gkjbzbg7nvelbat33wnhdqndkm

Base-delta-immediate compression

Gennady Pekhimenko, Vivek Seshadri, Onur Mutlu, Phillip B. Gibbons, Michael A. Kozuch, Todd C. Mowry
2012 Proceedings of the 21st international conference on Parallel architectures and compilation techniques - PACT '12  
Cache compression is a promising technique to increase on-chip cache capacity and to decrease on-chip and off-chip bandwidth usage.  ...  Compared to prior cache compression approaches, our studies show that B∆I strikes a sweet-spot in the tradeoff between compression ratio, decompression/compression latencies, and hardware complexity.  ...  We thank Greg Ganger, Kayvon Fatahalian and Michael Papamichael for their feedback on this paper's writing.  ... 
doi:10.1145/2370816.2370870 dblp:conf/IEEEpact/PekhimenkoSMGKM12 fatcat:lakldkz74bhsfetstbd5kycrd4
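The core observation of B∆I can be sketched compactly: values in a cache line often lie close to a common base, so the line can be stored as one base plus narrow deltas. The field widths below are one example configuration, not the paper's full set of base/delta encodings (which also include an implicit zero base for small immediates).

```python
# Illustrative Base-Delta encoding of a line of 8-byte values.
def bdi_compress(values, delta_bytes=1):
    """Try to encode the line as (base, signed deltas of delta_bytes each)."""
    base = values[0]
    limit = 1 << (8 * delta_bytes - 1)   # signed range for the chosen delta width
    deltas = [v - base for v in values]
    if all(-limit <= d < limit for d in deltas):
        return {"base": base, "delta_bytes": delta_bytes, "deltas": deltas}
    return None                           # incompressible with this encoding

def bdi_decompress(enc):
    """Decompression is a single addition per value: low latency by design."""
    return [enc["base"] + d for d in enc["deltas"]]
```

For example, a line of nearby pointers compresses from 8 bytes per value to one 8-byte base plus 1 byte per value; the one-adder decompression path is why B∆I sits at the low-latency end of the tradeoff the snippet mentions.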

Performance and power optimization through data compression in Network-on-Chip architectures

Reetuparna Das, Asit K. Mishra, Chrysostomos Nicopoulos, Dongkook Park, Vijaykrishnan Narayanan, Ravishankar Iyer, Mazin S. Yousif, Chita R. Das
2008 High-Performance Computer Architecture  
We also address techniques to hide the decompression latency by overlapping with NoC communication latency.  ...  The trend towards integrating multiple cores on the same die has accentuated the need for larger on-chip caches.  ...  This technique will save three clock cycles per decompression operation, effectively reducing the overhead from 5 cycles to 2 cycles.  ... 
doi:10.1109/hpca.2008.4658641 dblp:conf/hpca/DasMNPNIYD08 fatcat:su5ocscq6bhajeqrlqjueltxmm
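The 5-to-2-cycle figure in this snippet follows from simple overlap accounting: if decompression can begin as soon as the first flit of a compressed packet arrives, the cycles spent receiving the remaining flits hide part of the decompression latency. The pipeline details below are assumed; only the arithmetic comes from the snippet.

```python
# Back-of-the-envelope model of hiding decompression latency behind NoC transfer.
def exposed_decompression_cycles(decompress_cycles, remaining_flit_cycles):
    """Cycles of decompression latency left exposed after overlapping with
    the arrival of the packet's remaining flits."""
    return max(0, decompress_cycles - remaining_flit_cycles)
```

With a 5-cycle decompressor overlapped against 3 cycles of remaining flit arrival, only 2 cycles stay exposed; if the transfer takes longer than decompression, the overhead disappears entirely.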

MemSZ

Albin Eldstål-Ahrens, Ioannis Sourdis
2020 ACM Transactions on Architecture and Code Optimization (TACO)  
The first variant has a shared last-level cache (LLC) on the processor-die, which is modified to store both compressed and uncompressed data.  ...  Compared to the current state-of-the-art lossy memory compression design, MemSZ improves the execution time, energy, and memory traffic by up to 15%, 9%, and 64%, respectively.  ...  As in the previous design, the empty space following a compressed memory block is used for lazy eviction of dirty cache lines, a technique designed to reduce and postpone recompression overhead, discussed  ... 
doi:10.1145/3424668 fatcat:kcbtlsy5pvemniapnvaxqbgeqa

Practical Data Compression for Modern Memory Hierarchies [article]

Gennady Pekhimenko
2016 arXiv   pre-print
In this thesis, we describe a new, practical approach to integrating hardware-based data compression within the memory hierarchy, including on-chip caches, main memory, and both on-chip and off-chip interconnects  ...  Third, we propose a new main memory compression framework, Linearly Compressed Pages (LCP), that significantly reduces the complexity and power cost of supporting main memory compression.  ...  either due to unacceptable compression/decompression latency or high design complexity and high overhead to support variable size blocks after compression.  ... 
arXiv:1609.02067v1 fatcat:i4z7m2ydtjgwvlwmglno26nb54
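The "reduced complexity" claim for LCP rests on an addressing trick worth making concrete: if every cache line in a page is compressed to the same fixed target size, the location of line i is a linear function of i, avoiding the variable-size lookup that plagues other main-memory compression designs. The exception-region layout below is simplified for illustration.

```python
# Sketch of Linearly Compressed Pages (LCP) addressing.
LINE_SIZE = 64  # uncompressed cache-line size in bytes

def locate_line(line_index, compressed_size, n_lines, exceptions):
    """Byte offset of a line within an LCP page.

    exceptions: dict mapping line_index -> slot in the exception region,
    for lines that did not fit the fixed compressed_size target.
    """
    if line_index in exceptions:
        # Uncompressed lines live in an exception region after the linear area.
        return n_lines * compressed_size + exceptions[line_index] * LINE_SIZE
    return line_index * compressed_size   # the common case: pure linear addressing
```

The common-case offset needs only a shift-and-multiply, which is what lets LCP compute physical addresses with little extra latency or hardware.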

Decoupled Compressed Cache: Exploiting Spatial Locality for Energy Optimization

Somayeh Sardashti, David A. Wood
2014 IEEE Micro  
while using area overhead comparable to previous cache-compression techniques.  ...  , on-chip network, and off-chip memory.  ... 
doi:10.1109/mm.2014.42 fatcat:nbpgytodmnhfnijjx2seu2g7gi

A case for core-assisted bottleneck acceleration in GPUs

Nandita Vijaykumar, Gennady Pekhimenko, Adwait Jog, Abhishek Bhowmick, Rachata Ausavarungnirun, Chita Das, Mahmut Kandemir, Todd C. Mowry, Onur Mutlu
2015 Proceedings of the 42nd Annual International Symposium on Computer Architecture - ISCA '15  
We provide a comprehensive design and evaluation of CABA to perform effective and flexible data compression in the GPU memory hierarchy to alleviate the memory bandwidth bottleneck.  ...  This paper introduces the Core-Assisted Bottleneck Acceleration (CABA) framework that employs idle on-chip resources to alleviate different bottlenecks in GPU execution.  ...  Special thanks to Evgeny Bolotin, Saugata Ghose and Kevin Hsieh for their feedback during various stages of this project.  ... 
doi:10.1145/2749469.2750399 dblp:conf/isca/VijaykumarPJ0AD15 fatcat:vow55cmt3zhlxmg5o3x2lxx6ri

A case for core-assisted bottleneck acceleration in GPUs

Nandita Vijaykumar, Gennady Pekhimenko, Adwait Jog, Abhishek Bhowmick, Rachata Ausavarungnirun, Chita Das, Mahmut Kandemir, Todd C. Mowry, Onur Mutlu
2015 SIGARCH Computer Architecture News  
We provide a comprehensive design and evaluation of CABA to perform effective and flexible data compression in the GPU memory hierarchy to alleviate the memory bandwidth bottleneck.  ...  This paper introduces the Core-Assisted Bottleneck Acceleration (CABA) framework that employs idle on-chip resources to alleviate different bottlenecks in GPU execution.  ...  Special thanks to Evgeny Bolotin, Saugata Ghose and Kevin Hsieh for their feedback during various stages of this project.  ... 
doi:10.1145/2872887.2750399 fatcat:mdd25bfj25frrnazvn5aj2cfxm

AVR

Albin Eldstål-Damlin, Pedro Trancoso, Ioannis Sourdis
2019 Proceedings of the 48th International Conference on Parallel Processing - ICPP 2019  
The proposed AVR architecture supports our compression scheme maximizing its effect and minimizing its overheads by (i) co-locating in the Last Level Cache (LLC) compressed and uncompressed data, (ii)  ...  efficiently handling LLC evictions, (iii) keeping track of badly compressed memory blocks, and (iv) avoiding LLC pollution with unwanted decompressed data.  ...  First, existing designs for lossless memory and cache compression are presented and subsequently an overview is provided on approximate computing techniques that improve the performance of memory systems  ... 
doi:10.1145/3337821.3337824 dblp:conf/icpp/Eldstal-DamlinT19 fatcat:cflqkugcazcqtfwfprjtinkhsy

A Framework for Accelerating Bottlenecks in GPU Execution with Assist Warps [article]

Nandita Vijaykumar, Gennady Pekhimenko, Adwait Jog, Saugata Ghose, Abhishek Bhowmick, Rachata Ausavarungnirun, Chita Das, Mahmut Kandemir, Todd C. Mowry, Onur Mutlu
2016 arXiv   pre-print
We provide a comprehensive design and evaluation of CABA to perform effective and flexible data compression in the GPU memory hierarchy to alleviate the memory bandwidth bottleneck.  ...  This work describes the Core-Assisted Bottleneck Acceleration (CABA) framework that employs idle on-chip resources to alleviate different bottlenecks in GPU execution.  ...  Special thanks to Evgeny Bolotin and Kevin Hsieh for their feedback during various stages of this project.  ... 
arXiv:1602.01348v1 fatcat:qbzuknzcyncrticap55x4i5dhi

Reducing memory space consumption through dataflow analysis

Ozcan Ozturk
2011 Computer languages, systems & structures  
To reduce the memory space consumption of embedded systems, this paper proposes a control flow graph (CFG) based technique.  ...  On the other hand, if the memory allocated to this basic block cannot be reclaimed, we try to compress this basic block.  ...  Acknowledgments This research is supported in part by TUBITAK Grant 108E233, by a Grant from IBM, and by a Marie Curie International Reintegration Grant within the Seventh European Community Framework  ... 
doi:10.1016/j.cl.2011.07.001 fatcat:iwrb3wujvrg3jjegntvr7filhu
Showing results 1 — 15 out of 1,006 results