Lossless Compression Decoders for Bitstreams and Software Binaries Based on High-Level Synthesis

Jian Yan, Junqi Yuan, Philip H. W. Leong, Wayne Luk, Lingli Wang
2017 IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Moreover, in order to balance the objectives of compression ratio, decompression throughput, and hardware resource overhead, various improvements and optimizations are proposed.  ...  As the density of FPGAs has greatly improved over the past few years, the size of configuration bitstreams grows accordingly.  ...  LZ77 [32]: LZ77 is a dictionary-based text compression scheme.  ...
doi:10.1109/tvlsi.2017.2713527 fatcat:usmgiggxybfdnabfhigizkv5g4

FPGA-Based Lossless Data Compression using Huffman and LZ77 Algorithms

Suzanne Rigler, William Bishop, Andrew Kennings
2007 2007 Canadian Conference on Electrical and Computer Engineering  
Files compressed in hardware can be decompressed with the software version of GZIP. The flexibility of the design allows for hardware-based implementations using either FPGAs or ASICs.  ...  Unlike previous attempts to design hardware-based encoders [5, 6], the design is compliant with the GZIP specification and includes all three of the GZIP compression modes.  ...  The compression ratio was on average within 2% and reasonable runtime was recorded given the limitations of our FPGA prototype.  ...
doi:10.1109/ccece.2007.315 fatcat:fovqdx5huvcdvil4npyhlhtami

Practical speculative parallelization of variable-length decompression algorithms

Hakbeom Jang, Channoh Kim, Jae W. Lee
2013 SIGPLAN Notices
Typically, the compressor splits the original data into blocks and compresses each block with variable-length codes, hence producing variable-length compressed blocks.  ...  With SDM we effectively parallelize three production-grade variable-length decompression algorithms (zlib, bzip2, and H.264) with maximum speedups of 2.50× and 8.53× (and geometric mean speedups of 1.96×  ...  and Narinet Inc. for their support with the Tilera machine.  ...
doi:10.1145/2499369.2465557 fatcat:ggaax5zvfvcwvgaqscvb7d4lre

Practical speculative parallelization of variable-length decompression algorithms

Hakbeom Jang, Channoh Kim, Jae W. Lee
2013 Proceedings of the 14th ACM SIGPLAN/SIGBED conference on Languages, compilers and tools for embedded systems - LCTES '13  
Typically, the compressor splits the original data into blocks and compresses each block with variable-length codes, hence producing variable-length compressed blocks.  ...  With SDM we effectively parallelize three production-grade variable-length decompression algorithms (zlib, bzip2, and H.264) with maximum speedups of 2.50× and 8.53× (and geometric mean speedups of 1.96×  ...  and Narinet Inc. for their support with the Tilera machine.  ...
doi:10.1145/2491899.2465557 fatcat:6qrv5idmszeknmmhrtbwsrwg3q

A Configurable Statistical Lossless Compression Core Based on Variable Order Markov Modeling and Arithmetic Coding

J.L. Nunez-Yanez, V.A. Chouliaras
2005 IEEE Transactions on Computers
This novel lossless compression core offers innovative solutions to the computational issues in both stages of modelling and coding and delivers high compression efficiency and throughput.  ...  This type of statistical coding algorithm has long been regarded as being able to deliver very high compression ratios close to the information content of the source data.  ...  Performance Comparison: This section analyses the performance of the core in terms of compression ratio and throughput and compares it with other state of the art universal data compression algorithms implemented  ...
doi:10.1109/tc.2005.171 fatcat:pxixiyazorb7va37ymwfrwrz6q

A Fine-Grained Multicasting of Configuration Data for Coarse-Grained Reconfigurable Architectures

Takuya KOJIMA, Hideharu AMANO
2019 IEICE Transactions on Information and Systems
Furthermore, since both the dynamic power consumption of the configuration controller and the configuration time are improved, it achieves 50.1% reduction of the energy consumption for the configuration with  ...  The proposed technique is based on a multicast configuration technique called RoMultiC, which reduces the configuration time by multicasting the same data to multiple PEs (Processing Elements) with two  ...  This work is supported by VLSI Design and Education Center (VDEC), the University of Tokyo in collaboration with Synopsys, Inc. and Cadence Design Systems, Inc.  ...
doi:10.1587/transinf.2018edp7336 fatcat:wzpeqrtn6reddbv24er5pm5xmy

Huffman-based code compression techniques for embedded processors

Talal Bonny, Jörg Henkel
2010 ACM Transactions on Design Automation of Electronic Systems  
I am also very grateful to Professor Wael Adi for accepting to be my co-examiner and providing valuable feedback.  ...  Acknowledgements: First and foremost, I would like gratefully and sincerely to thank my advisor Professor Jörg Henkel for enabling and supporting the research presented in this thesis and for offering an  ...  This dissertation could not have been written without his support and guidance.  ...
doi:10.1145/1835420.1835424 fatcat:z25cwe2n3jfqpptg2ggprwbqsy

A Survey of Compressed GPU-Based Direct Volume Rendering [article]

Marcos Balsa Rodríguez, Enrico Gobbetti, José A. Iglesias Guitián, Maxim Makhinya, Fabio Marton, Renato Pajarola, Susanne K. Suter
2012 Eurographics State of the Art Reports  
To address this issue, a variety of level-of-detail data representations and compression techniques have been introduced.  ...  Compression and level-of-detail pre-computation do not have to adhere to real-time constraints and can be performed off-line for high quality results.  ...  Decompression and rendering: As previously mentioned, asymmetric compression schemes are desired, as they are designed to provide fast decoding at runtime at the expense of increased (but high quality)  ...
doi:10.2312/conf/eg2013/stars/117-136 fatcat:3cadb2miwngrjoaqmrmudww6lq

Program Analysis and Compiler Transformations for Computational Accelerators

Taylor Lloyd
2018
We present Run-Length Base-Delta (RLBD) encoding, a very high-speed compression format and algorithm capable of improving throughput of 40GbE up to 57% on datasets taken from the UCI Machine Learning Repository  ...  First, this thesis studies the challenge of programming for FPGAs. IBM and Intel both now produce systems with integrated FPGAs, but FPGA programming remains extremely challenging.  ...  Schema selection is critical to the performance of the RLBD compressor to maximize both the compression ratio and the compression throughput.  ...
doi:10.7939/r3z892x2m fatcat:xidz27urjrdurgqm4iis4bd2me

Transparent Memory Hierarchy Compression and Migration

Lei Yang
2008 unpublished
In particular, the techniques presented in this dissertation explore the use of software and hardware compression of physical and virtual memory to improve performance and functionality of uniprocessor  ...  Although many aspects of embedded system design and synthesis have received significant research attention, comparatively less attention has been given to new ideas in memory hierarchy design.  ...  Their results show that parallel compressors with cooperatively constructed dictionaries have compression efficiency essentially equivalent to that of the sequential LZ77 method. Wilson et al.  ... 
doi:10.21985/n22f2c fatcat:ka6flq5mifdgxpxomz4paomznm

GPU-Acceleration of In-Memory Data Analytics

Evangelia Sitaridi
2017
Due to the increasing memory capacity and also the user's need for fast interaction with the data, we focus on in-memory analytics.  ...  Hardware advances strongly influence the database system design.  ...  We also implement Gompresso/Byte, based on LZ77 with byte-level encoding. It trades off slightly lower compression ratios for an average of 3× higher decompression speed.  ... 
doi:10.7916/d8fn16bz fatcat:3prapmemhrhbfduzdgvarkpgbi

Adding limited reconfigurability to superscalar processors

M. Epalza, P. Ienne, D. Mlynek
Proceedings. 13th International Conference on Parallel Architecture and Compilation Techniques, 2004. PACT 2004.  
For the last thirty years, electronics, at first built with discrete components, and then as Integrated Circuits (IC), have brought diverse and lasting improvements to our quality of life.  ...  Application-Specific Integrated Circuits (ASICs) were traditionally used for their high performance and low manufacturing cost, and were designed specifically for a single application with large volumes  ...  The benchmarks in the SPEC CPU 2000 suite are briefly presented below, beginning with the integer benchmarks: gzip is a popular data compression program which uses Lempel-Ziv coding (LZ77) as its compression  ... 
doi:10.1109/pact.2004.1342541 fatcat:kghcg5bwnzappluxmcsjxdd7s4

Efficient in-hardware compression of on-chip data

Amin Ghasemazar
2021
Thesaurus significantly improves the state-of-the-art cache compression ratio to 2.25×. Next, we apply our insights to special-purpose applications.  ...  Similar trends exist in special-purpose computing systems, with only up to tens of megabytes of on-chip memory available in most recent AI accelerators.  ...  Throughput-optimized OpenCL-based FPGA accelerator for large-scale convolutional neural networks. In FPGA, 2016. [239] Torsten Suel. Delta Compression Techniques. [240] Tony Summers.  ...
doi:10.14288/1.0404515 fatcat:nxtj5xz4yffm7inyjyp7k6caem