193 Hits in 6.8 sec

Improving Prediction-Based Lossy Compression Dramatically Via Ratio-Quality Modeling [article]

Sian Jin, Sheng Di, Jiannan Tian, Suren Byna, Dingwen Tao, Franck Cappello
2021 arXiv   pre-print
Error-bounded lossy compression is one of the most effective techniques for scientific data reduction.  ...  Our analytical model significantly improves prediction-based lossy compression in three use cases: (1) optimization of the predictor by selecting the best-fit predictor; (2) memory compression with a target  ...  Nuclear Security Administration, responsible for the planning and preparation of a capable exascale ecosystem, including software, applications, hardware, advanced system engineering and early testbed  ... 
arXiv:2111.09815v2 fatcat:pujdfwousjf6nhxktz44txqile
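The predictor-selection use case mentioned in this entry can be pictured with a toy proxy: for each candidate predictor, quantize the prediction residuals at the error bound and treat the entropy of the resulting codes as an estimate of bits per value. This is an illustrative sketch, not the paper's model; `ratio_proxy` and the two 1-D predictors are hypothetical names.

```python
import numpy as np

def ratio_proxy(data, predictor, error_bound):
    """Compression-ratio proxy for one predictor: quantize residuals
    into bins of width 2*error_bound and estimate bits/value from the
    entropy of the bin histogram (vs. 32 bits for raw floats)."""
    residuals = data - predictor(data)
    bins = np.round(residuals / (2.0 * error_bound)).astype(np.int64)
    _, counts = np.unique(bins, return_counts=True)
    p = counts / counts.sum()
    entropy = float(-np.sum(p * np.log2(p)))   # estimated bits per value
    return 32.0 / max(entropy, 1e-9)

# Two illustrative 1-D predictors: previous value, linear extrapolation.
prev_value = lambda d: np.concatenate(([0.0], d[:-1]))
linear_fit = lambda d: np.concatenate(([0.0, 0.0], 2.0 * d[1:-1] - d[:-2]))

def best_predictor(data, error_bound, candidates):
    """Best-fit predictor selection: keep the candidate whose
    estimated compression ratio is highest."""
    return max(candidates, key=lambda p: ratio_proxy(data, p, error_bound))
```

On smooth data the residual histogram concentrates near zero, so the entropy estimate drops and the proxy ratio rises, which is the intuition behind selecting a predictor from a ratio-quality estimate rather than by trial compression.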

CEAZ: Accelerating Parallel I/O Via Hardware-Algorithm Co-Designed Adaptive Lossy Compression [article]

Chengming Zhang, Sian Jin, Tong Geng, Jiannan Tian, Ang Li, Dingwen Tao
2021 arXiv   pre-print
To this end, many previous works have studied using error-bounded lossy compressors to reduce the data size and improve the I/O performance.  ...  In this paper, we propose a hardware-algorithm co-design for an efficient and adaptive lossy compressor for scientific data on FPGAs (called CEAZ), which is the first lossy compressor that can achieve  ...  when running the simulation 5 times with 200 snapshots dumped per run.  ... 
arXiv:2106.13306v2 fatcat:42fvquu3trcgxncxwdl5izksra

Z-checker

Dingwen Tao, Sheng Di, Hanqi Guo, Zizhong Chen, Franck Cappello
2017 The International Journal of High Performance Computing Applications  
Because of the vast volume of data produced by today's scientific simulations and experiments, a lossy data compressor allowing user-controlled loss of accuracy during compression is a relevant solution.  ...  For lossy compression users, Z-checker can assess the compression quality, providing various global distortion analyses that compare the original data with the decompressed data, as well as statistical analysis of  ... 
doi:10.1177/1094342017737147 fatcat:mmcug266vjdbljhe2qfuwjrwtq

Fixed-PSNR Lossy Compression for Scientific Data [article]

Dingwen Tao, Sheng Di, Xin Liang, Zizhong Chen, Franck Cappello
2018 arXiv   pre-print
Error-controlled lossy compression has been studied for years because of the extremely large volumes of data produced by today's scientific simulations.  ...  In this paper, we propose a novel technique providing fixed-PSNR lossy compression for scientific data sets.  ... 
arXiv:1805.07384v3 fatcat:6fqkhblw7bc7bkctazb6jbhkfe
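The fixed-PSNR idea hinges on the algebraic link between PSNR and an absolute error bound. A minimal sketch of that link, under the common assumption that pointwise errors are uniformly distributed in [-e, e] so that RMSE = e/sqrt(3); the function names are hypothetical, not the paper's API:

```python
import math

def psnr_db(value_range, rmse):
    """PSNR in decibels for a dataset with the given value range and RMSE."""
    return 20.0 * math.log10(value_range / rmse)

def psnr_to_abs_bound(value_range, target_psnr_db):
    """Absolute error bound e expected to yield the target PSNR,
    assuming errors uniform in [-e, e] (so RMSE = e / sqrt(3))."""
    rmse = value_range / (10.0 ** (target_psnr_db / 20.0))
    return math.sqrt(3.0) * rmse
```

Feeding `psnr_to_abs_bound(value_range, 60.0)` to an absolute-error-bounded compressor then targets roughly 60 dB, which is the spirit of steering PSNR through the error bound.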

SDC Resilient Error-bounded Lossy Compressor [article]

Sihuan Li, Sheng Di, Kai Zhao, Xin Liang, Zizhong Chen, Franck Cappello
2020 arXiv   pre-print
In fact, SDC is becoming non-negligible because of the exascale computing demand of complex scientific simulations producing vast volumes of data, and because of particular instruments/devices (such as  ...  It keeps the decompressed data correctly bounded within the user's requirement, with a very limited degradation of compression ratios upon soft errors.  ...  To the best of our knowledge, no ABFT work has been done for lossy compression algorithms, which is a significant gap in the context of scientific data compression.  ... 
arXiv:2010.03144v1 fatcat:35b7j4goc5fm3fbqzes77vmyam

Fast and Efficient Compression of Floating-Point Data

Peter Lindstrom, Martin Isenburg
2006 IEEE Transactions on Visualization and Computer Graphics  
We propose a simple scheme for lossless, online compression of floating-point data that transparently integrates into the I/O of many applications.  ...  Large scale scientific simulation codes typically run on a cluster of CPUs that write/read time steps to/from a single file system.  ...  Acknowledgements This work was performed in part under the auspices of the U.S. DOE by LLNL under contract no. W-7405-Eng-48, and was funded in part by NSF grant CCF-0430065.  ... 
doi:10.1109/tvcg.2006.143 pmid:17080858 fatcat:cnyrxbsw3ngg3lv255w6hnoxvi
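The scheme in this entry predicts each value and encodes only the difference between predicted and actual bit patterns. A minimal lossless sketch of that idea, assuming a previous-value predictor and XOR residuals (the paper uses stronger context-based predictors and entropy coding):

```python
import struct

def bits_of(x):
    """IEEE-754 bit pattern of a double as an unsigned 64-bit int."""
    return struct.unpack('<Q', struct.pack('<d', x))[0]

def encode(values):
    """Predict each double by its predecessor and XOR the bit patterns;
    on smooth data the residuals carry many leading zeros, which a real
    codec would entropy-code away."""
    prev, out = 0, []
    for v in values:
        b = bits_of(v)
        out.append(prev ^ b)
        prev = b
    return out

def decode(residuals):
    """Invert encode() exactly -- the scheme is lossless."""
    prev, out = 0, []
    for r in residuals:
        prev ^= r
        out.append(struct.unpack('<d', struct.pack('<Q', prev))[0])
    return out
```

Because XOR is its own inverse, the round trip is bit-exact, and the compressible structure lives entirely in the leading-zero runs of the residuals.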

SZ3: A Modular Framework for Composing Prediction-Based Error-Bounded Lossy Compressors [article]

Xin Liang, Kai Zhao, Sheng Di, Sihuan Li, Robert Underwood, Ali M. Gok, Jiannan Tian, Junjing Deng, Jon C. Calhoun, Dingwen Tao, Zizhong Chen, Franck Cappello
2021 arXiv   pre-print
Today's scientific simulations require a significant reduction of data volume because of the extremely large amounts of data they produce and the limited I/O bandwidth and storage space.  ...  Experiments show that our customized compression pipelines lead to up to 20% improvement in compression ratios under the same data distortion compared with the state-of-the-art approaches.  ... 
arXiv:2111.02925v2 fatcat:o2px4znevbgxhb7wl7g6nhfxpu
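The modularity this entry describes can be pictured as interchangeable stages wired into one pipeline. A toy composition with illustrative stage names (note that prediction must use already-reconstructed values so the error bound holds end to end; a real framework would swap in richer predictors, quantizers, and encoders per dataset):

```python
EB = 1e-3  # illustrative absolute error bound

def predict_prev(i, recon):
    return recon[i - 1] if i > 0 else 0.0

def quantize(residual):
    return int(round(residual / (2.0 * EB)))

def dequantize(code):
    return code * 2.0 * EB

def encode(codes):
    return codes  # stand-in for a Huffman/arithmetic coding stage

class LossyPipeline:
    """Wire pluggable stages into one prediction-based compressor:
    predict -> quantize residual -> encode codes.  Each stage is any
    callable of the right shape, so pipelines can be recomposed."""
    def __init__(self, predict, quantize, dequantize, encode):
        self.predict, self.quantize = predict, quantize
        self.dequantize, self.encode = dequantize, encode

    def compress(self, data):
        recon, codes = [], []
        for i, x in enumerate(data):
            p = self.predict(i, recon)          # predict from reconstructed values
            q = self.quantize(x - p)
            recon.append(p + self.dequantize(q))
            codes.append(q)
        return self.encode(codes), recon
```

Swapping any one stage (say, a Lorenzo predictor for `predict_prev`) changes the pipeline without touching the others, which is the composability argument in a nutshell.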

A Parallel Data Compression Framework for Large Scale 3D Scientific Data [article]

Panagiotis Hadjidoukas, Fabian Wermelinger
2019 arXiv   pre-print
The software yields in situ compression ratios of 100x or higher for fluid dynamics data produced by petascale simulations of cloud cavitation collapse using O(10^11) grid cells, with negligible impact  ...  In this work, we address these challenges through a novel software framework for scientific data compression.  ...  Petros Koumoutsakos for providing feedback during manuscript drafting.  ... 
arXiv:1903.07761v1 fatcat:h6d6m7oyhbhejgjlns4rmuzsqa

Improving performance of iterative methods by lossy checkpointing

Dingwen Tao, Sheng Di, Xin Liang, Zizhong Chen, Franck Cappello
2018 Proceedings of the 27th International Symposium on High-Performance Parallel and Distributed Computing - HPDC '18  
We formulate a lossy checkpointing performance model and theoretically derive an upper bound on the extra number of iterations caused by the distortion of data in lossy checkpoints, in order to guarantee performance improvement under the lossy checkpointing scheme. (3) We analyze the impact of lossy checkpointing (i.e., the extra iterations caused by lossy checkpoint files) for multiple types  ...  Patrick Bridges for his helpful suggestions for the final paper.  ... 
doi:10.1145/3208040.3208050 dblp:conf/hpdc/TaoDLCC18 fatcat:zqm2t3fnxvhuheu3em6pj5325y
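The performance argument in this entry reduces to a break-even comparison, which can be sketched with simple arithmetic. All symbols here are illustrative, not the paper's model: lossy checkpointing pays off when the checkpoint-time saved across a run exceeds the worst-case cost of the extra iterations induced by checkpoint distortion.

```python
def lossy_ckpt_pays_off(n_ckpts, t_ckpt_lossless, t_ckpt_lossy,
                        extra_iters_bound, t_iter):
    """Break-even test: time saved by writing smaller (lossy)
    checkpoints vs. worst-case cost of extra solver iterations."""
    saved = n_ckpts * (t_ckpt_lossless - t_ckpt_lossy)
    overhead = extra_iters_bound * t_iter
    return saved > overhead
```

For example, with 100 checkpoints at 10 s (lossless) vs. 1 s (lossy) each, and a bound of 50 extra iterations at 2 s per iteration, lossy checkpointing saves 900 s against 100 s of overhead, so it wins.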

Fast Error-Bounded Lossy HPC Data Compression with SZ

Sheng Di, Franck Cappello
2016 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS)  
As for the unpredictable data that cannot be approximated by curve-fitting models, we perform an optimized lossy compression via a binary representation analysis.  ...  The compression method starts by linearizing multi-dimensional snapshot data. The key idea is to fit/predict the successive data points with the best-fit selection of curve-fitting models.  ...  for exascale scientific simulation.  ... 
doi:10.1109/ipdps.2016.11 dblp:conf/ipps/DiC16 fatcat:v3tbp37cjnaljktiigxattqnhy
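The two ideas in this entry (best-fit selection among curve-fitting models, plus a fallback for unpredictable points) can be sketched for a 1-D array. This is a toy in the spirit of SZ, not its implementation: real SZ entropy-codes the quantization codes and compresses the unpredictable values via binary representation analysis rather than storing them verbatim.

```python
import numpy as np

def sz_like_compress(data, eb, max_code=255):
    """For each point, build predictions from already-reconstructed
    neighbors (preceding value, linear, quadratic extrapolation), pick
    the best fit, and quantize the residual at error bound eb; points
    whose residual exceeds the code range are stored verbatim."""
    recon = np.empty_like(data)
    stream = []            # ('q', predictor_id, code) or ('raw', value)
    for i, x in enumerate(data):
        preds = []
        if i >= 1:
            preds.append(recon[i-1])                                   # preceding
        if i >= 2:
            preds.append(2*recon[i-1] - recon[i-2])                    # linear
        if i >= 3:
            preds.append(3*recon[i-1] - 3*recon[i-2] + recon[i-3])     # quadratic
        if preds:
            pid = int(np.argmin([abs(x - p) for p in preds]))
            best = preds[pid]
        else:
            pid, best = -1, None
        if best is not None and abs(x - best) <= max_code * eb:
            q = int(round((x - best) / (2.0 * eb)))
            recon[i] = best + q * 2.0 * eb     # reconstruction error <= eb
            stream.append(('q', pid, q))
        else:
            recon[i] = x                       # "unpredictable" point
            stream.append(('raw', float(x)))
    return stream, recon
```

Predicting from `recon` rather than `data` is what makes the error bound hold: the decompressor sees exactly the same reconstructed neighbors, so each point's error is capped at `eb` by construction.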

The ATree: A Data Structure to Support Very Large Scientific Databases [chapter]

Pedja Bogdanovich, Hanan Samet
1999 Lecture Notes in Computer Science  
The reduced format allows larger datasets to be stored on local disk for analysis. Data indexing permits efficient manipulation of the data, and thus improves the productivity of the researcher.  ...  A data structure called the ATree is described that meets the demands of interactive scientific applications. The ATree data structure is suitable for storing data abstracts as well as original data.  ...  A climate simulation model is run for a simulation period of 30 years. Every 3-6 hours of simulation time, the state of the model is dumped out.  ... 
doi:10.1007/3-540-46621-5_14 fatcat:gxksjlc7czczpaulzrfdslhqii

Scalable load-balance measurement for SPMD codes

Todd Gamblin, Bronis R. de Supinski, Martin Schulz, Rob Fowler, Daniel A. Reed
2008 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis  
Compression time scales sublinearly with system size and data volume is several orders of magnitude smaller than the raw data. The overhead is low enough for online use in a production environment.  ...  We show that our technique collects and reconstructs systemwide measurements with low error.  ...  For comparison, we performed an identical set of runs in which we dumped exhaustive data to disk. For these experiments, the exhaustive dump is done after the consolidation of rows.  ... 
doi:10.1109/sc.2008.5222553 dblp:conf/sc/GamblinSSFR08 fatcat:os7rhbvvmvaqlavku7dodu7qhm

Revisiting Huffman Coding: Toward Extreme Performance on Modern GPU Architectures [article]

Jiannan Tian, Cody Rivera, Sheng Di, Jieyang Chen, Xin Liang, Dingwen Tao, Franck Cappello
2020 arXiv   pre-print
Today's high-performance computing (HPC) applications are producing vast volumes of data, which are challenging to store and transfer efficiently during execution, such that data compression is becoming  ...  Experiments show that our solution can improve the encoding throughput by up to 5.0X and 6.8X on NVIDIA RTX 5000 and V100, respectively, over the state-of-the-art GPU Huffman encoder, and by up to 3.3X  ... 
arXiv:2010.10039v1 fatcat:ngpdnh24urdsfiot44smh2rjjm
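The baseline being accelerated here is classic Huffman coding: codebook construction is inherently serial, while the per-symbol encoding pass is what GPU implementations parallelize. For reference, a standard heap-based codebook builder (this is the textbook algorithm, not the paper's GPU kernel):

```python
import heapq
from collections import Counter

def huffman_codes(data):
    """Build a prefix-free code: repeatedly merge the two lightest
    subtrees; symbols in the lighter subtree get a '0' prepended,
    those in the heavier a '1'."""
    freq = Counter(data)
    if len(freq) == 1:                        # degenerate single-symbol input
        return {next(iter(freq)): '0'}
    # Heap entries: [weight, tiebreak, {symbol: code-so-far}]
    heap = [[w, i, {s: ''}] for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        lo = heapq.heappop(heap)
        hi = heapq.heappop(heap)
        merged = {s: '0' + c for s, c in lo[2].items()}
        merged.update({s: '1' + c for s, c in hi[2].items()})
        heapq.heappush(heap, [lo[0] + hi[0], tiebreak, merged])
        tiebreak += 1
    return heap[0][2]
```

Once the codebook exists, encoding each symbol is an independent table lookup followed by a bit-concatenation, which is why the encoding stage maps well onto thousands of GPU threads.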

Resilience and fault tolerance in high-performance computing for numerical weather and climate prediction

Tommaso Benacchio, Luca Bonaventura, Mirco Altenbernd, Chris D Cantwell, Peter D Düben, Mike Gillard, Luc Giraud, Dominik Göddeke, Erwan Raffin, Keita Teranishi, Nils Wedi
2021 The International Journal of High Performance Computing Applications  
Numerical examples showcase the performance of the techniques in addressing faults, with particular emphasis on iterative solvers for linear systems, a staple of atmospheric fluid flow solvers.  ...  Trade-offs between performance, efficiency and effectiveness of resiliency strategies are analysed and some recommendations outlined for future developments.  ...  Acknowledgements We thank the authors of Agullo et al. (2016a, 2016b), namely E Agullo, L Giraud, A Guermouche, J Roman, P Salas, and M Zounon, for the permission to report the  ... 
doi:10.1177/1094342021990433 fatcat:tfhovb6xmfemtkgzzrkpiiiju4

A practical toolkit for computational steering

S.M. Pickles, R. Haines, R.L. Pinning, A.R. Porter
2005 Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences  
Computational steering refers to the real-time interaction of a scientist with their running simulation code.  ...  Steerable data dumping frequencies enable the user to increase the amount of generated data during periods of the simulation when events of particular interest are happening.  ... 
doi:10.1098/rsta.2005.1611 pmid:16099752 fatcat:xjsk4hkkdrgzphg4ujsr2d2hca
Showing results 1 — 15 out of 193 results