Throttling I/O Streams to Accelerate File-IO Performance [chapter]

Seetharami Seelam, Andre Kerstens, Patricia J. Teller
2007 Lecture Notes in Computer Science  
We call this mechanism file-I/O stream throttling.  ...  /O stream throttling.  ...  Oliker of LBNL for giving us access to MADbench and D. Skinner and the UTEP DAiSES team, particularly, Y. Kwok, S. Araunagiri, R. Portillo, and M. Ruiz, for their valuable feedback.  ... 
doi:10.1007/978-3-540-75444-2_67 fatcat:ivn3fummmveb3oy5j2p6tuyoue

H5hut: A high-performance I/O library for particle-based simulations

Mark Howison, Andreas Adelmann, E. Wes Bethel, Achim Gsell, Benedikt Oswald, Prabhat
2010 2010 IEEE International Conference On Cluster Computing Workshops and Posters (CLUSTER WORKSHOPS)  
I/O and data management expertise.  ...  Achieving high-performance I/O for this data, effectively managing it on disk, and interfacing it with analysis and visualization tools can be challenging, especially for domain scientists who do not have  ...  ADIOS also features performance optimizations such as asynchronous I/O, which double buffers data and offloads I/O operations onto designated I/O threads, allowing a computational code to continue non-I  ... 
doi:10.1109/clusterwksp.2010.5613098 fatcat:zol4d2ltxzbqldooe3s25ew2xm

On the Scalability of Data Reduction Techniques in Current and Upcoming HPC Systems from an Application Perspective [chapter]

Axel Huebl, René Widera, Felix Schmitt, Alexander Matthes, Norbert Podhorszki, Jong Youl Choi, Scott Klasky, Michael Bussmann
2017 Lecture Notes in Computer Science  
for reduced I/O latency.  ...  Identifying throughput and overall I/O size as a major challenge for applications on today's and future HPC systems, we present a scaling law characterizing performance bottlenecks in state-of-the-art  ...  IO +t I/O .  ... 
doi:10.1007/978-3-319-67630-2_2 fatcat:xta7aa3pfrh4rlpfxu3ysk644u

Accelerating Network Communication and I/O in Scientific High Performance Computing Environments

Sarah Marie Neuwirth
2019
For example for large-scale application runs, POSIX I/O and MPI-IO can be improved by up to 50% on a per job basis, while HDF5 shows performance improvements of up to 32%.  ...  The solutions maximize the parallelization and throughput of file I/O. The frameworks are evaluated on the Titan supercomputing systems for three I/O interfaces.  ...  Acknowledgements Application: MPI_File_open(MPI_COMM_WORLD, "testfile", MPI_MODE_CREATE, info, fd) TAPP-IO  ... 
doi:10.11588/heidok.00025757 fatcat:6slijzy6k5hr5iiw443gbuvwe4

Cooperative GPGPU Scheduling for Consolidating Server Workloads

Yusuke SUZUKI, Hiroshi YAMADA, Shinpei KATO, Kenji KONO
2018 IEICE transactions on information and systems  
This paper presents GLoop, which is a software runtime that enables us to consolidate GPGPU apps including GPU eaters.  ...  Such highly functional GPGPU apps, referred to as GPU eaters, can easily monopolize a shared GPU and starve collocated GPGPU apps.  ...  The events include file I/O, network I/O, and GPU yields.  ... 
doi:10.1587/transinf.2018edp7027 fatcat:gmgxosap7nhxjanmgcv3uniw3y

Pfimbi: Accelerating big data jobs through flow-controlled data replication

Simbarashe Dzinamarira, Florin Dinu, T. S. Eugene Ng
2016 2016 32nd Symposium on Mass Storage Systems and Technologies (MSST)  
Pfimbi has numerous benefits: It accelerates jobs, exploits under-utilized storage I/O bandwidth, and supports hierarchical storage I/O bandwidth allocation policies.  ...  The performance of HDFS is critical to big data software stacks and has been at the forefront of recent efforts from the industry and the open source community.  ...  ACKNOWLEDGEMENT We would like to thank the anonymous reviewers for their thoughtful feedback.  ... 
doi:10.1109/msst.2016.7897074 dblp:conf/mss/DzinamariraDN16 fatcat:p7iiaaq6yvbozd24qfhxzbugnm

Capturing inter-application interference on clusters

Aamer Shah, Felix Wolf, Sergey Zhumatiy, Vladimir Voevodin
2013 2013 IEEE International Conference on Cluster Computing (CLUSTER)  
Unfortunately, traditional performance-analysis techniques consider an application always in isolation, without the ability to compare its performance to the overall performance conditions on the system  ...  Cluster systems usually run several applications, often from different users, concurrently, with individual applications competing for access to shared resources such as the file system or the network.  ...  We found file I/O to be a major catalyst of interference with I/O performance degraded by up to 50%.  ... 
doi:10.1109/cluster.2013.6702665 dblp:conf/cluster/ShahWZV13 fatcat:3bvs5pffm5hehg37wphekugnce

Offloading IDS Computation to the GPU

Nigel Jacob, Carla Brodley
2006 Proceedings of the Computer Security Applications Conference  
We propose a solution that off-loads some of the computation performed by the IDS to the Graphics Processing Unit (GPU).  ...  The results show that as the CPU load on the IDS host system increases, PixelSnort's performance is significantly more robust and is able to outperform conventional Snort by up to 40%.  ...  IDS Performance Issues Signature-matching intrusion detection systems have two types of performance limitations: 1) CPU-bound limitations that arise due to string-matching and 2) I/O-bound limitations  ... 
doi:10.1109/acsac.2006.35 dblp:conf/acsac/JacobB06 fatcat:7oofv6cfdrfnxp7pov5bzrvz2u

Cloud-Based FPGA Custom Computing Machines for Streaming Applications

Amran A. Al-Aghbari, Muhammad E. S. Elrabaa
2019 IEEE Access  
Its performance was 3-4x and ∼1.4-2.4x better than an SW implementation on a VM and a powerful server, respectively.  ...  It allows the users to launch/use/tear down vFPGA-based CCMs in a similar manner to conventional virtual machines (VMs).  ...  A specially developed on-FPGA controller reformats the data according to user specifications and follows the hardware I/O protocol.  ... 
doi:10.1109/access.2019.2906910 fatcat:edtlqvylejejbn3eufkhwwrrti

Decoupled Direct Memory Access: Isolating CPU and IO Traffic by Leveraging a Dual-Data-Port DRAM

Donghyuk Lee, Lavanya Subramanian, Rachata Ausavarungnirun, Jongmoo Choi, Onur Mutlu
2015 2015 International Conference on Parallel Architecture and Compilation (PACT)  
By doing so, our proposal increases the effective memory channel bandwidth, thereby either accelerating data transfers between system components, or providing opportunities to employ IO performance enhancement  ...  Our goal, in this work, is to improve system performance by eliminating memory channel contention between CPU accesses and IO accesses.  ...  Off-Loading I/O Management. I/O processor (IOP) [30] has been proposed to provide efficient interfaces between processors and IO devices.  ... 
doi:10.1109/pact.2015.51 dblp:conf/IEEEpact/LeeSACM15 fatcat:sm7bb67vnneyrkqerox66ck7ve

Sampling in Thermal Simulation of Processors: Measurement, Characterization, and Evaluation

Ehsan K. Ardestani, Francisco J. Mesa-Martinez, Gabriel Southern, Elnaz Ebrahimi, Jose Renau
2013 IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems  
This paper aims to improve the accuracy and performance of sampled thermal simulation at the architectural level.  ...  To the best of our knowledge, this paper is the first to evaluate the impact of statistical sampling on thermal metrics through direct temperature measurements performed at runtime.  ...  Since all the SPEC applications are designed to be CPU bound, we complement them by also evaluating five workloads involving I/O: System Boot, Linux make, pdflatex, emacs, and BDB.  ... 
doi:10.1109/tcad.2013.2253156 fatcat:pv7yijd5pfcwrlvqmwcspb4qvu

Monocular Imaging-based Autonomous Tracking for Low-cost Quad-rotor Design - TraQuad [article]

Lakshmi Shrinivasan, Prasad N R
2018 arXiv   pre-print
This article describes the applications and advantages of TraQuad and the reduction in cost (to about 250) that has been achieved so far using the hardware and software capabilities and our custom algorithms  ...  We would like to thank Venkat of Edall Systems, Bangalore for generous assistance in programming with APM PWM RC Override input.  ...  Acknowledgments We would like to thank Erle Robotics team for the prompt assistance in correction of documentation of their Copter's SITL and also Vladimir Ermakov, leading developer of MAVROS for assistance  ... 
arXiv:1801.06847v1 fatcat:c64avpmw2ffcriibunguffbxou

Efficient I/O Virtualisation in Asymmetric Multiprocessor Architectures [article]

Chung Hwan Lee, The Australian National University
2017
Server virtualisation is increasingly popular but still suffers from poor I/O performance. There are several methods to address the problem.  ...  One solution is the side-core approach to offload virtualisation I/O processing on a dedicated core, which offers close to bare-metal performance without sacrificing important virtualisation features.  ...  Single Root I/O Virtualisation (SR-IOV). As aforementioned, a PCI network device assigned to a domU using the direct IO technique cannot be shared.  ... 
doi:10.25911/5d70f12835970 fatcat:on4ppzijhfdrfdgdijfthqjo4m

A Stream Processing Framework for On-Line Optimization of Performance and Energy Efficiency on Heterogeneous Systems

Benjamin Ranft, Oliver Denninger, Philip Pfaffe
2014 2014 IEEE International Parallel & Distributed Processing Symposium Workshops  
Scheduling is automatically adapted on-line to continuously optimize performance and energy efficiency.  ...  Modern processors have the potential of executing compute-intensive programs quickly and efficiently, but require applications to be adapted to their ever increasing parallelism.  ...  It is nevertheless very useful in conjunction with I/O-related processing steps such as reading from disk, receiving sensor data, displaying results or connecting to a middleware [43].  ... 
doi:10.1109/ipdpsw.2014.119 dblp:conf/ipps/RanftDP14 fatcat:7yca7spnsbbprmk67o4yvmgcxe

Automatic Parallelism Tuning Mechanism for Heterogeneous IP-SAN Protocols in Long-fat Networks

Takamichi Nishijima, Hiroyuki Ohsaki, Makoto Imase
2013 Journal of Information Processing  
A block device layer is a layer that receives read/write requests from an application or a file system, and relays those requests to a storage device.  ...  We evaluate the performance of BDL-APT with heterogeneous IP-SAN protocols (NBD, GNBD and iSCSI) in a long-fat network.  ...  For every bio structure object, the pointer to a callback function can be specified in the field bio end io of the bio structure object. structure objects so that the total block I/O response size can  ... 
doi:10.2197/ipsjjip.21.423 fatcat:jdjjzcwmibgxvb33vcdecqvud4