Approaching DRAM performance by using microsecond-latency flash memory for small-sized random read accesses: a new access method and its graph applications

Tomoya Suzuki, Kazuhiro Hiwada, Hirotsugu Kajihara, Shintaro Sano, Shuou Nomura, Tatsuo Shiozawa
2021 Proceedings of the VLDB Endowment  
In our evaluation, the large graph data is placed on microsecond-latency flash memories within prototype boards, and it is accessed by the proposed method.  ...  For applications in which small-sized random accesses frequently occur for datasets that exceed DRAM capacity, placing the datasets on SSD can result in poor application performance.  ...  In this paper, we present a new solution for small-sized random read access using flash memory rather than byte-addressable SCM.  ... 
dblp:journals/pvldb/SuzukiHKSNS21 fatcat:otvkqlag6zc4npqrdiqtzo374i

Providing safe, user space access to fast, solid state disks

Adrian M. Caulfield, Todor I. Mollov, Louis Alex Eisner, Arup De, Joel Coburn, Steven Swanson
2012 SIGARCH Computer Architecture News  
We evaluate the performance of the system using a suite of microbenchmarks and database workloads and show that the new interface improves latency and bandwidth for 4 KB writes by 60% and 7.2×, respectively  ...  Emerging fast, non-volatile memories (e.g., phase change memories, spin-torque MRAMs, and the memristor) reduce storage access latencies by an order of magnitude compared to state-of-the-art flash-based  ...  We would also like to thank the reviewers for their feedback and suggestions and the Xilinx University Program for their support. This work was supported by NSF awards CCF-1018672 and OCI-0910847.  ... 
doi:10.1145/2189750.2151017 fatcat:diyttz27d5harogj3wegw5w6qe

Providing safe, user space access to fast, solid state disks

Adrian M. Caulfield, Todor I. Mollov, Louis Alex Eisner, Arup De, Joel Coburn, Steven Swanson
2012 Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS '12  
We evaluate the performance of the system using a suite of microbenchmarks and database workloads and show that the new interface improves latency and bandwidth for 4 KB writes by 60% and 7.2×, respectively  ...  Emerging fast, non-volatile memories (e.g., phase change memories, spin-torque MRAMs, and the memristor) reduce storage access latencies by an order of magnitude compared to state-of-the-art flash-based  ...  We would also like to thank the reviewers for their feedback and suggestions and the Xilinx University Program for their support. This work was supported by NSF awards CCF-1018672 and OCI-0910847.  ... 
doi:10.1145/2150976.2151017 dblp:conf/asplos/CaulfieldMEDCS12 fatcat:kb2ql3kpuzhihoyl6rpsy4dg7q

Designing a True Direct-Access File System with DevFS

Sudarsun Kannan, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, Yuangang Wang, Jun Xu, Gopinath Palani
2018 USENIX Conference on File and Storage Technologies  
A novel reverse-caching mechanism enables the usage of host memory for inactive objects, thus reducing memory load upon the device.  ...  Evaluation of an emulated DevFS prototype shows more than 2x higher I/O throughput with direct access and up to a 5x reduction in device RAM utilization.  ...  This material was supported by funding from NSF grants CNS-1421033 and CNS-1218405, and DOE grant DE-SC0014935.  ... 
dblp:conf/fast/KannanAAWXP18 fatcat:adjvdlis5vff7cknlygxhzqw3m

Prototyping a hybrid main memory using a virtual machine monitor

Dong Ye, Aravind Pavuluri, Carl A. Waldspurger, Brian Tsang, Bohuslav Rychlik, Steven Woo
2008 IEEE International Conference on Computer Design  
We use a novel virtualization-based approach for computer architecture performance analysis.  ...  We present a case study analyzing a hypothetical hybrid main memory, which consists of a first-level DRAM augmented by a 10-100x slower second-level memory.  ...  Thanks to Yiu Cho Lau for his automation scripts and the VMware performance group for their help.  ... 
doi:10.1109/iccd.2008.4751873 dblp:conf/iccd/YePWTRW08 fatcat:rqspxae6rjbgnmc56hhqob3b24

BigSparse: High-performance external graph analytics [article]

Sang-Woo Jun, Andy Wright, Sizhuo Zhang, Shuotao Xu, Arvind
2017 arXiv   pre-print
In our experiments on a server with 32GB to 64GB of DRAM, BigSparse outperforms other in-memory and semi-external graph analytics systems for algorithms such as PageRank, Breadth-First Search, and Betweenness-Centrality  ...  for terabyte-size graphs with billions of vertices.  ...  Our BigSparse Architecture addresses the issue of fine-grained random accesses using a method we call Sort-Reduce.  ... 
arXiv:1710.07736v1 fatcat:3rlvy45dtzazjecsqiilg2vgeu

Hardware-Accelerated Platforms and Infrastructures for Network Functions: A Survey of Enabling Technologies and Research Studies

Prateek Shantharama, Akhilesh S. Thyagaturu, Martin Reisslein
2020 IEEE Access  
second, memory access latency, cache and memory read and write performance, as well as I/O behaviors.  ...  The CPU uses normal load and store instructions that are used for DRAM-access to access the PM NVDIMM memory.  ... 
doi:10.1109/access.2020.3008250 fatcat:kv4znpypqbatfk2m3lpzvzb2nu

Physically addressed queueing (PAQ)

Myoungsoo Jung, Ellis H. Wilson, Mahmut Kandemir
2012 SIGARCH Computer Architecture News  
We implement PAQ in a cycle-accurate simulator and demonstrate bandwidth and IOPS improvements greater than 62% and latency decreases as much as 41.6% for random reads, without degrading performance of  ...  NAND flash storage has proven to be a competitive alternative to traditional disk for its properties of high random-access speeds, low power and its presumed efficacy for random reads.  ...  This work is supported in part by NSF grants 1017882, 0937949, and 0833126 and DOE grant DE-SC0002156.  ... 
doi:10.1145/2366231.2337206 fatcat:cs3kys3lwzhj7i5tglvyaeodea

Reaping the performance of fast NVM storage with uDepot

Kornilios Kourtis, Nikolas Ioannou, Ioannis Koltsidas
2019 USENIX Conference on File and Storage Technologies  
Many applications require low-latency key-value storage, a requirement that is typically satisfied using key-value stores backed by DRAM.  ...  structure that dynamically adjusts its DRAM footprint to match the inserted items, and employs a novel task-based IO run-time system to maximize performance, enabling applications to use fast NVM devices  ...  Finally, we would like to thank Intel for providing early access to an Optane testbed.  ... 
dblp:conf/fast/KourtisIK19 fatcat:w2pdgjywbrahzgsycfkqmppzje

The CacheLib Caching Engine: Design and Experiences at Scale

Benjamin Berg, Daniel S. Berger, Sara McAllister, Isaac Grosof, Sathya Gunasekar, Jimmy Lu, Michael Uhlar, Jim Carrig, Nathan Beckmann, Mor Harchol-Balter, Gregory R. Ganger
2020 USENIX Symposium on Operating Systems Design and Implementation  
Commonly, each cache is implemented and maintained independently by a distinct team and is highly specialized to its function.  ...  For example, an application-data cache would be independent from a CDN cache.  ...  Acknowledgements This work is supported by NSF-CMMI-1938909, NSF-CSR-1763701, NSF-XPS-1629444, a 2020 Google Faculty Research Award, and a Facebook Graduate Fellowship.  ... 
dblp:conf/osdi/BergBMGGLUCBHG20 fatcat:hrkka4wk55chfha77yaesspidq

Unification of Temporary Storage in the NodeKernel Architecture

Patrick Stuedi, Animesh Trivedi, Jonas Pfefferle, Ana Klimovic, Adrian Schüpbach, Bernard Metzler
2019 USENIX Annual Technical Conference  
NodeKernel provides hierarchical naming, high scalability, and close to bare-metal performance for a wide range of data sizes and access patterns that are characteristic of temporary data.  ...  by up to 4.8× and Spark application performance by up to 3.4×.  ...  Acknowledgments We thank our shepherd, Michael Swift, and the anonymous Usenix ATC reviewers for their helpful feedback.  ... 
dblp:conf/usenix/StuediTPKSM19 fatcat:sfw634jq6jhyrcg75hj5tpbfbm

The RAMCloud Storage System

John Ousterhout, Mendel Rosenblum, Stephen Rumble, Ryan Stutsman, Stephen Yang, Arjun Gopalan, Ashish Gupta, Ankita Kejriwal, Collin Lee, Behnam Montazeri, Diego Ongaro, Seo Jin Park (+1 others)
2015 ACM Transactions on Computer Systems  
It uses a uniform log-structured mechanism to manage both DRAM and secondary storage, which results in high performance and efficient memory usage.  ...  In many cases, DRAM is used as a cache for some other storage system, such as a database; this approach forces developers to manage consistency between the cache and the backing store, and its performance  ...  Our goal for RAMCloud is to achieve the lowest possible latency for small random accesses in large-scale applications; at the time of publication, this was around 5μs for small clusters and 10μs in a large  ... 
doi:10.1145/2806887 fatcat:fg3r5yahbjhxhcor6m2w2q6bxy

SmartSAGE: Training Large-scale Graph Neural Networks using In-Storage Processing Architectures [article]

Yunjae Lee, Jinha Chung, Minsoo Rhu
2022 arXiv   pre-print
Given the large performance gap between DRAM and SSD, however, blindly utilizing SSDs as a direct substitute for DRAM leads to significant performance loss.  ...  without being hampered by the physical limitations of main memory size.  ...  Program of the NRF funded by the Korea government MSIT under grant NRF-2020M3H6A1085498, and by Samsung Electronics Co., Ltd (IO201210-07974-01).  ... 
arXiv:2205.04711v1 fatcat:nvgvsja7r5c4zclfx6dvzx526q

Boosting random write performance for enterprise flash storage systems

Tao Xie, Janak Koshia
2011 IEEE 27th Symposium on Mass Storage Systems and Technologies (MSST)  
In this thesis, to boost flash SSD random write performance, we develop a new cache management scheme called element-level parallel optimization (EPO), which buffers and reorders write requests so that  ...  Koshia Master of Science in Computer Science San Diego State University, 2011 NAND flash memory is playing a key role in the revolution of storage systems due to its desirable features such as fast random  ...  I would also like to thank the other members of the committee for their time and effort.  ... 
doi:10.1109/msst.2011.5937226 dblp:conf/mss/XieK11 fatcat:7sx4qebbobeirhmrkd263lwlte

Optimizing the Block I/O Subsystem for Fast Storage Devices

Young Jin Yu, Dong In Shin, Woong Shin, Nae Young Song, Jae Woo Choi, Hyeong Seog Kim, Hyeonsang Eom, Heon Young Yeom
2014 ACM Transactions on Computer Systems  
Our optimization principles are 1) minimizing per-request overhead by redesigning I/O path, and 2) mitigating per-request overhead by using a request batching scheme.  ...  Recent development of storage devices has been driven by advanced memory technology, which shifts the paradigm of data access mechanism from magnetics and mechanics to electronics.  ...  However, the benefit is exploited only by a large-sized request; if discontiguous small requests are dispatched to a storage device one by one, the concurrent access to flash chips would hardly occur,  ... 
doi:10.1145/2619092 fatcat:4qkksnb2onbohfz7aiha2p6mau