
On the use of burst buffers for accelerating data-intensive scientific workflows

Rafael Ferreira da Silva, Scott Callaghan, Ewa Deelman
2017 Proceedings of the 12th Workshop on Workflows in Support of Large-Scale Science - WORKS '17  
In this paper, we examine the impact of burst buffers through the remote-shared, allocatable burst buffers on the Cori system at NERSC.  ...  By running a subset of the SCEC CyberShake workflow, a production seismic hazard analysis workflow, we find that using burst buffers offers read and write improvements of about an order of magnitude, and  ...  Section 2 provides background on data-intensive scientific workflows, and an overview of burst buffer architectures.  ... 
doi:10.1145/3150994.3151000 dblp:conf/sc/SilvaCD17 fatcat:tomvvnfqgnbdlbjyl67x4u7z5y

Scientific Workflows at DataWarp-Speed: Accelerated Data-Intensive Science Using NERSC's Burst Buffer

Andrey Ovsyannikov, Melissa Romanus, Brian Van Straalen, Gunther H. Weber, David Trebotich
2016 2016 1st Joint International Workshop on Parallel Data Storage and data Intensive Scalable Computing Systems (PDSW-DISCS)  
Emerging exascale systems have the ability to accelerate the time-to-discovery for scientific workflows.  ...  With respect to the proposed workflow, we study the performance of the Cray DataWarp Burst Buffer and provide a comparison with the Lustre parallel file system.  ...  This research used resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the U.S. DOE under Contract No.  ... 
doi:10.1109/pdsw-discs.2016.005 dblp:conf/sc/OvsyannikovRSWT16 fatcat:tihjnym4knedpk76yq3e5sicne

On the Non-Suitability of Non-Volatility

John Bent, Brad Settlemyer, Nathan DeBardeleben, Sorin Faibish, Dennis Ting, Uday Gupta, Percy Tzelnic
2015 USENIX Workshop on Hot Topics in Storage and File Systems  
For these workloads, HPC data centers are deploying NAND flash as a storage acceleration tier, commonly called burst buffers, to provide high levels of write bandwidth for checkpoint storage.  ...  One such workload is long-running scientific applications which use checkpoint-restart for failure recovery.  ...  (a) Checkpoint-Restart Workflow with a Reliable Burst Buffer (b) Checkpoint-Restart Workflow with an Unreliable Burst Buffer Figure 2: Comparison of Reliable and Nonreliable Burst Buffers  ... 
dblp:conf/hotstorage/BentSDFTGT15 fatcat:ivjo36oufrhk7khboftwcsg6ee

Performance characterization of scientific workflows for the optimal use of Burst Buffers

C.S. Daley, D. Ghoshal, G.K. Lockwood, S. Dosanjh, L. Ramakrishnan, N.J. Wright
2017 Future generations computer systems  
Scientific discoveries are increasingly dependent upon the analysis of large volumes of data from observations and simulations of complex phenomena.  ...  Scientists compose the complex analyses as workflows and execute them on large-scale  ...  Burst Buffers. Several uses of BBs have been shown in order to mitigate the I/O bottlenecks of data-intensive workloads [6, 32, 33, 34].  ... 
doi:10.1016/j.future.2017.12.022 fatcat:lfr7pqgm4na6depkae6dc4zthq

Understanding Data Motion in the Modern HPC Data Center

Glenn K. Lockwood, Shane Snyder, Suren Byna, Philip Carns, Nicholas J. Wright
2019 2019 IEEE/ACM Fourth International Parallel Data Systems Workshop (PDSW)  
We show that parallel I/O from user jobs, while undeniably important, is only one of several major I/O workloads that occurs throughout the execution of scientific workflows.  ...  systematically identify coupled data motion for individual workflows.  ...  The ground-truth data volumes for the burst buffer were measured using NERSC's daily smartctl monitoring data.  ... 
doi:10.1109/pdsw49588.2019.00012 fatcat:ip42esv5pzbspd6iwzkceqs2gm

Prediction-based auto-scaling of scientific workflows

Reginald Cushing, Spiros Koulouzis, Adam S. Z. Belloum, Marian Bubak
2011 Proceedings of the 9th International Workshop on Middleware for Grids, Clouds and e-Science - MGC '11  
Tasks within a data-centric scientific workflow are often data dependent on each other where each task can, potentially, be a data intensive task.  ...  This is achieved through shadow queues on the central message exchange. A shadow queue acts as a buffer for the set of messages consumed by the parent.  ... 
doi:10.1145/2089002.2089003 dblp:conf/middleware/CushingKBB11 fatcat:wtdda46dxncufjcgcmpvwamgaa

NORNS: Extending Slurm to Support Data-Driven Workflows through Asynchronous Data Staging

Alberto Miranda, Adrian Jackson, Tommaso Tocci, Iakovos Panourgias, Ramon Nou
2019 2019 IEEE International Conference on Cluster Computing (CLUSTER)  
While the inclusion of burst buffers has helped to alleviate this by improving I/O performance, it has also increased the complexity of the I/O hierarchy by adding additional storage layers each with its  ...  This forces users to explicitly manage data movement between the different storage layers, which, coupled with the lack of interfaces to communicate data dependencies between jobs in a data-driven workflow  ...  First, burst buffers (shared or node-local) can be used (at least) as temporary storage for intermediate data, to cache PFS data, and as a medium to share data between workflow phases.  ... 
doi:10.1109/cluster.2019.8891014 dblp:conf/cluster/MirandaJTPN19 fatcat:hc7nopkglze3pe3njvkwkzhusi

Benefit of DDN's IME-FUSE for I/O Intensive HPC Applications [chapter]

Eugen Betke, Julian Kunkel
2018 Lecture Notes in Computer Science  
Then, it takes a closer look at the IME-FUSE file system, which uses IMEs as burst buffer and a Lustre file system as back-end.  ...  This paper investigates the native performance of DDN-IME, a flash-based burst buffer solution.  ...  Acknowledgment Thanks to DDN for providing access to the IME test cluster and to Jean-Thomas Acquaviva for the support.  ... 
doi:10.1007/978-3-030-02465-9_9 fatcat:f3edktuqwjd6jdiqc6pet5ezwq

Sensitivity Analysis for Time Dependent Problems: Optimal Checkpoint-Recompute HPC Workflows

Varis Carey, Hasan Abbasi, Ivan Rodero, Hemanth Kolla
2014 2014 9th Workshop on Workflows in Support of Large-Scale Science  
However, one of the challenges of the adjoint workflow for time-dependent applications is the storage and I/O requirements for the application state.  ...  This approach drastically reduces the total volume of stored data, allows the caching of state in the regeneration window in memory and on local SSDs, may accelerate the application execution by reducing  ...  The authors wish to thank the members of the ExaCT Center for Exascale Simulation of Combustion in Turbulence for useful discussions and support.  ... 
doi:10.1109/works.2014.15 dblp:conf/sc/CareyARK14 fatcat:4yfpvys5fjb3hnasg5uu2bn3fa

Challenges and Opportunities of User-Level File Systems for HPC (Dagstuhl Seminar 17202)

André Brinkmann, Kathryn Mohror, Weikuan Yu, Marc Herbstritt
2017 Dagstuhl Reports  
Although the benefits of hierarchical storage have been adequately demonstrated to the point that the newest leadership class HPC systems will employ burst buffers, critical questions remain for supporting  ...  How should we manage data movement through a storage hierarchy for best performance and resilience of data? How do the particular I/O use cases mandate the way we manage data?  ...  The talk Data Movement Requirements for HPC and Data-Intensive Burst Buffers from Dean Hildebrand focuses on the requirements to integrate burst buffers into the HPC environment.  ... 
doi:10.4230/dagrep.7.5.97 dblp:journals/dagstuhl-reports/BrinkmannMY17 fatcat:2bquax3oz5c5xlsoxdkiomotiy

Lessons Learned from Building In Situ Coupling Frameworks

Matthieu Dorier, Matthieu Dreher, Tom Peterka, Justin M. Wozniak, Gabriel Antoniu, Bruno Raffin
2015 Proceedings of the First Workshop on In Situ Infrastructures for Enabling Extreme-Scale Analysis and Visualization - ISAV2015  
Over the past few years, the increasing amounts of data produced by large-scale simulations have motivated a shift from traditional offline data analysis to in situ analysis and visualization.  ...  Going beyond this simple pairwise tight coupling, complex analysis workflows today are graphs with one or more data sources and several interconnected analysis components.  ...  This work was done in the framework of a collaboration between the KerData joint Inria - ENS Rennes - Insa Rennes team and Argonne National Laboratory within the Joint Laboratory for Extreme-Scale Computing  ... 
doi:10.1145/2828612.2828622 dblp:conf/sc/DorierDPWAR15 fatcat:vclffn2fu5c5ba5rcuys7qiimq

Status and progress of China SKA Regional Centre prototype [article]

Tao An, Xiaocong Wu, Baoqiang Lao, Shaoguang Guo, Zhijun Xu, Weijia Lv, Yingkang Zhang, Zhongli Zhang
2022 arXiv   pre-print
The computing resources needed to process, distribute, curate and use the vast amount of data that will be generated by the SKA telescopes are too large for the SKAO to manage on its own.  ...  The paper presents examples of scientific applications of SKA precursor and pathfinder telescopes completed using resources from the China SRC prototype.  ...  If all ARM compute nodes are used, then up to 960 compute cores can be used, which is very helpful for the compute-intensive use case like this one.  ... 
arXiv:2206.13022v1 fatcat:mrpcr5tdtveafafmikg3teaxjy

SDN for End-to-End Networked Science at the Exascale (SENSE)

Inder Monga, Chin Guok, John MacAuley, Alex Sim, Harvey Newman, Justas Balcas, Phil DeMar, Linda Winkler, Tom Lehman, Xi Yang
2018 2018 IEEE/ACM Innovating the Network for Data-Intensive Science (INDIS)  
The Software-defined network for End-to-end Networked Science at Exascale (SENSE) research project is building smart network services to accelerate scientific discovery in the era of 'big data' driven  ...  and highly tuned complex workflows that require close coupling of resources spread across a vast geographic footprint such as those used in science domains like high-energy physics and basic energy sciences  ...  DTN is used loosely here, could be a subset of the supercomputer nodes) • Write the data to the burst buffers layer (NVRAM) • Distribute the data from the burst buffers to the local memory on the HPC  ... 
doi:10.1109/indis.2018.00007 dblp:conf/sc/MongaGMSNBDWL018 fatcat:goux54vgbbebldcz7mzuverzwe

ASCR/HEP Exascale Requirements Review Report [article]

Salman Habib, Robert Roser, Richard Gerber, Katie Antypas, Katherine Riley, Tim Williams, Jack Wells, Tjerk Straatsma, A. Almgren, J. Amundson, S. Bailey, D. Bard, K. Bloom (+39 others)
2016 arXiv   pre-print
be available at ASCR facilities, e) to build up and train a workforce capable of developing and using simulations and analysis to support HEP scientific research on next-generation systems.  ...  To best use ASCR HPC resources the experimental HEP program needs a) an established long-term plan for access to ASCR computational and data resources, b) an ability to map workflows onto HPC resources  ...  Explicit access to manage intermediate data storage (e.g., burst buffers) would be useful.  ... 
arXiv:1603.09303v2 fatcat:5j4ovt4rmbg45jxjwyhdi5ozva

Exploring the Role of Machine Learning in Scientific Workflows: Opportunities and Challenges [article]

Azita Nouri, Philip E. Davis, Pradeep Subedi, Manish Parashar
2021 arXiv   pre-print
Furthermore, we provide recommendations on how to extend ML techniques to unresolved challenges in the execution of scientific workflows.  ...  We explore the challenges of in-situ workflows and provide suggestions for improving the performance of their execution using ML techniques.  ...  [29], using shared burst buffer, and local burst buffer.  ... 
arXiv:2110.13999v1 fatcat:va7e4uacafh5noqqywbk7svxzy