Scaling a file system to many cores using an operation log

Srivatsa S. Bhat, Rasha Eqbal, Austin T. Clements, M. Frans Kaashoek, Nickolai Zeldovich
2017 Proceedings of the 26th Symposium on Operating Systems Principles - SOSP '17  
ScaleFS is a novel file system design that decouples the in-memory file system from the on-disk file system using per-core operation logs.  ...  ScaleFS logs operations in a per-core log so that it can delay propagating updates to the disk representation (and the cache-line conflicts involved in doing so) until an fsync.  ...  INTRODUCTION Many of today's file systems do not scale well on multicore machines, and much effort is spent on improving them to allow file-system-intensive applications to scale better [4, 10, 13, 23  ... 
doi:10.1145/3132747.3132779 dblp:conf/sosp/BhatECKZ17 fatcat:dw4qvexzhbhl7khvs75m6lcrru

NrOS: Effective Replication and Sharing in an Operating System

Ankit Bhardwaj, Chinmay Kulkarni, Reto Achermann, Irina Calciu, Sanidhya Kashyap, Ryan Stutsman, Amy Tai, Gerd Zellweger
2021 USENIX Symposium on Operating Systems Design and Implementation  
NrOS replicates kernel state on each NUMA node and uses operation logs to maintain strong consistency between replicas.  ...  This kernel is scaled across NUMA nodes using node replication, a scheme inspired by state machine replication in distributed systems.  ...  Chinmay Kulkarni is supported by a Google PhD Fellowship.  ... 
dblp:conf/osdi/Bhardwaj0ACKSTZ21 fatcat:s7dunts2yrgetof65wpczb3tb4

Understanding Manycore Scalability of File Systems

Changwoo Min, Sanidhya Kashyap, Steffen Maass, Taesoo Kim
2016 USENIX Annual Technical Conference  
We draw a set of observations on file system scalability behavior and unveil several core aspects of file system design that systems researchers must address.  ...  We found 25 scalability bottlenecks in file systems, many of which are unexpected or counterintuitive.  ...  ScaleFS [41] extends a scalable in-memory file system to support consistency by using operation log on an on-disk file system.  ... 
dblp:conf/usenix/MinKMK16 fatcat:luafxuyjbbaubcjd2tews4brca

Parallelizing Shared File I/O operations of NVM File System for Manycore Servers

June-Hyung Kim, Youngjae Kim, Safdar Jamil, Chang-Gyu Lee, Sungyong Park
2021 IEEE Access  
INDEX TERMS Operating system, file system, non-volatile memory, manycore CPU.  ...  NOVA, a state-of-the-art non-volatile memory (NVM) file system, has limited performance due to its coarse-grained per-file lock when multiple threads perform I/Os to a shared file in a manycore environment  ...  It is mainly used in a distributed file system where large data or metadata files are shared by many threads.  ... 
doi:10.1109/access.2021.3054905 fatcat:7z44pkfbkbgdfab2dbey3f6lhu

SLoG: Large-Scale Logging Middleware for HPC and Big Data Convergence

Pierre Matri, Philip Carns, Robert Ross, Alexandru Costan, Maria S. Perez, Gabriel Antoniu
2018 2018 IEEE 38th International Conference on Distributed Computing Systems (ICDCS)  
We present SLoG, a shared log middleware providing a shared log abstraction over a parallel file system, designed to circumvent the aforementioned limitations.  ...  In contrast, HPC developers have a much more limited choice, typically restricted to a centralized parallel file system for persistent storage.  ...  Experiments presented were carried out using resources of the Argonne Leadership Computing Facility, which is a DOE Office of Science User Facility (Contract DE-AC02-06CH11357).  ... 
doi:10.1109/icdcs.2018.00156 dblp:conf/icdcs/MatriCRCPA18 fatcat:n54mdkdsdndnpe6wdei3afjfoe

High Performance Computing: Infrastructure, Application, and Operation

Byung-Hoon Park, Youngjae Kim, Byoung-Do Kim, Taeyoung Hong, Sungjun Kim, John K. Lee
2012 Journal of Computing Science and Engineering  
To introduce key aspects of HPC to a broader community, an HPC session was organized for the first time ever for the United States and Korea Conference (UKC) during 2012.  ...  This paper summarizes four invited talks that each covers scientific HPC applications, large-scale parallel file systems, administration/maintenance of supercomputers, and green technology towards building  ...  Current processor technology is moving fast beyond the era of multi-core towards many-core on-chip. The Intel 80 core chip is an attempt at many-core single chip powered data centers.  ... 
doi:10.5626/jcse.2012.6.4.280 fatcat:qwx3kjttrbe2vhwvhmjbzl35t4

SpanFS: A Scalable File System on Fast Storage Devices

Junbin Kang, Benlong Zhang, Tianyu Wo, Weiren Yu, Lian Du, Shuai Ma, Jinpeng Huai
2015 USENIX Annual Technical Conference  
To scale file systems to many cores, we propose SpanFS, a novel file system which consists of a collection of micro file system services called domains.  ...  SpanFS is implemented based on the Ext4 file system. Experimental results evaluating SpanFS against Ext4 on a modern PCI-E SSD show that SpanFS scales much better than Ext4 on a 32-core machine.  ...  Acknowledgements We would like to thank our shepherd Chia-Lin Yang and the anonymous reviewers for their valuable suggestions that help improve this paper significantly. Junbin  ... 
dblp:conf/usenix/KangZWYDMH15 fatcat:jk6b3vxynneb5j5xqrihuuiv2i

Modular HPC I/O Characterization with Darshan

Shane Snyder, Philip Carns, Kevin Harms, Robert Ross, Glenn K. Lockwood, Nicholas J. Wright
2016 2016 5th Workshop on Extreme-Scale Programming Tools (ESPT)  
These large-scale HPC platforms employ increasingly complex I/O subsystems to provide a suitable level of I/O performance to applications.  ...  Tuning I/O workloads for such a system is nontrivial, and the results generally are not portable to other HPC systems.  ...  Rather than logging every I/O operation submitted by an application (as a tracing tool would), Darshan captures a bounded amount of data for each file opened by the application, including I/O operation  ... 
doi:10.1109/espt.2016.006 dblp:conf/sc/SnyderCHRLW16 fatcat:h22nvvifindyxlvcrzmca3hpee

LDPLFS: Improving I/O Performance without Application Modification

S.A. Wright, S.D. Hammond, S.J. Pennycook, I. Miller, J.A. Herdman, S.A. Jarvis
2012 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum  
This method employs a dynamic library to intercept the low-level POSIX operations and retarget them to use the equivalents offered by PLFS.  ...  In order to address the growing divergence between processing speeds and I/O performance, the Parallel Log-structured File System (PLFS) has been developed by EMC Corporation and the Los Alamos National  ...  Sandia National Laboratories is a multiprogram laboratory managed and operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy's National Nuclear Security Administration  ... 
doi:10.1109/ipdpsw.2012.172 dblp:conf/ipps/WrightHPMHJ12 fatcat:j33prdmqhffjtmtix67x5elzfu


Jongyul Kim, Insu Jang, Waleed Reda, Jaeseong Im, Marco Canini, Dejan Kostić, Youngjin Kwon, Simon Peter, Emmett Witchel
2021 Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles CD-ROM  
In multi-tenant systems, the CPU overhead of distributed file systems (DFSes) is increasingly a burden to application performance.  ...  To fully leverage the SmartNIC architecture, we decompose DFS operations into execution stages that can be offloaded to a parallel datapath execution pipeline on the SmartNIC.  ...  This work is supported by an  ... 
doi:10.1145/3477132.3483565 fatcat:ohpsqrgfafeqbmaal7hqgycgrm

Designing New Operating Primitives to Improve Fuzzing Performance

Wen Xu, Sanidhya Kashyap, Changwoo Min, Taesoo Kim
2017 Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security - CCS '17  
We observe that AFL, a state-of-the-art fuzzer, slows down by 24× because of file system contention and the scalability of fork() system call when it runs on 120 cores in parallel.  ...  Current research on fuzzing has focused on producing an input that is more likely to trigger a vulnerability.  ...  In addition, prior works have used in-memory file systems to hide the file system overhead; instead we use it in the form of two-level caching to provide a required file system interface as well as the  ... 
doi:10.1145/3133956.3134046 dblp:conf/ccs/XuKMK17 fatcat:o2pkjtqjgzhqnnhzojifghhdo4

Challenges to Error Diagnosis in Hadoop Ecosystems

Jim Zhanwen Li, Siyuan He, Liming Zhu, Xiwei Xu, Min Fu, Len Bass, Anna Liu, An Binh Tran
2013 USENIX Large Installation Systems Administration Conference  
Deploying a large-scale distributed ecosystem such as HBase/Hadoop in the cloud is complicated and error-prone.  ...  These errors are difficult to diagnose because of scattered log management and lack of ecosystem-awareness in many diagnosis tools and processes.  ...  We experimented and demonstrated the feasibility of the approach using a small set of common Hadoop ecosystem errors.  ... 
dblp:conf/lisa/LiHZXFBLT13 fatcat:b3pqyvmicnfj3lwiwtdx3f6g7u

Quantifying the impact of frequency scaling on the energy efficiency of the single-chip cloud computer

A. Bartolini, MohammadSadegh Sadri, J. Furst, A. K. Coskun, L. Benini
2012 2012 Design, Automation & Test in Europe Conference & Exhibition (DATE)  
Single-chip many-core systems bring new challenges owing to the large number of operating points and the shift to message passing interface (MPI) from shared memory communication.  ...  This paper evaluates the impact of frequency scaling on the performance and power of many-core systems with MPI.  ...  Therefore, compared to multinode MPI-based systems, scaling the frequency of cores in a single-chip many-core system with MPI has higher impact due to the stronger coupling of communication characteristics  ... 
doi:10.1109/date.2012.6176459 dblp:conf/date/BartoliniSFCB12 fatcat:trznwxxyhrbxjjmpthbzm3l45e


Kaushik Veeraraghavan, Dongyoon Lee, Benjamin Wester, Jessica Ouyang, Peter M. Chen, Jason Flinn, Satish Narayanasamy
2011 Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems - ASPLOS '11  
We evaluate DoublePlay on a variety of client, server, and scientific parallel benchmarks; with spare cores, DoublePlay reduces logging overhead to an average of 15% with two worker threads and 28% with  ...  Deterministic replay systems record and reproduce the execution of a hardware or software system.  ...  Offline replay To support offline replay, DoublePlay records the system calls and synchronization operations executed during an epoch in a set of log files (for simplicity, DoublePlay uses a separate log  ... 
doi:10.1145/1950365.1950370 dblp:conf/asplos/VeeraraghavanLWOCFN11 fatcat:bmpmrkacfneihnd2citifv3egu

Concurrent log-structured memory for many-core key-value stores

Alexander Merritt, Ada Gavrilovska, Yuan Chen, Dejan Milojicic
2017 Proceedings of the VLDB Endowment  
As many applications benefit from having as much of their working state fit into main memory, an important design of the memory management of modern key-value stores is the use of log-structured approaches  ...  for insertion operations to avoid contending for centralized resources such as the log head and memory pools.  ...  We execute an instance of Postmark within a file system created by TableFS, a storage layer that uses a key-value store within FUSE to implement file system operations. Trace characteristics.  ... 
doi:10.1145/3186728.3164142 fatcat:koi4zcgbl5b2npourxuswojj3q