DI-MMAP—a scalable memory-map runtime for out-of-core data-intensive applications

Brian Van Essen, Henry Hsieh, Sasha Ames, Roger Pearce, Maya Gokhale
2013 Cluster Computing  
Finally, we demonstrate that DI-MMAP shows scalable out-of-core performance for BFS traversal in main memory constrained scenarios.  ...  Such scalable memory constrained performance would allow a system with a fixed amount of memory to solve a larger problem as well as provide memory QoS guarantees for systems running multiple data-intensive  ...  Open Access This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s)  ... 
doi:10.1007/s10586-013-0309-0 fatcat:xqor673yfzb5bb5tjvqhhvi4c4

DI-MMAP: A High Performance Memory-Map Runtime for Data-Intensive Applications

Brian Van Essen, Henry Hsieh, Sasha Ames, Maya Gokhale
2012 2012 SC Companion: High Performance Computing, Networking Storage and Analysis  
data access latency sensitive minimal computation Introduction DI-MMAP Runtime Experiments Conclusions Data-intensive memory-map runtime (DI-MMAP) A high-performance alternative to Linux mmap  ...  Experiments Conclusions Data-Intensive High-Performance Computing Data-Intensive Applications: large data sets large working sets that exceed capacity of main memory memory bound irregular  ...  Conclusions The data-intensive memory-map (DI-MMAP) runtime: 1. provides scalable, out-of-core performance for data-intensive applications 2. allows increased performance of algorithms with increased concurrency  ... 
doi:10.1109/sc.companion.2012.99 dblp:conf/sc/EssenHAG12 fatcat:kbjsu67jbreevm4fzxkyhp5fgm

UMap: Enabling Application-driven Optimizations for Page Management [article]

Ivy B. Peng, Marty McFadden, Eric Green, Keita Iwabuchi, Kai Wu, Dong Li, Roger Pearce, Maya Gokhale
2019 arXiv   pre-print
Memory mapping files on different tiers of storage provides a uniform interface in applications.  ...  By providing a data object abstraction layer, Umap is extensible to support various backing stores. The design of Umap supports dynamic load balancing and I/O decoupling for scalable performance.  ...  This research was also supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S.  ... 
arXiv:1910.07566v1 fatcat:ettj6ppjujdejcy352idxja4mm

Design and Optimization of a Metagenomics Analysis Workflow for NVRAM

Sasha Ames, Jonathan E. Allen, David A. Hysom, G. Scott Lloyd, Maya B. Gokhale
2014 2014 IEEE International Parallel & Distributed Processing Symposium Workshops  
We present a novel metagenomic analysis pipeline that leverages emerging large address space compute nodes with NVRAM to hold a searchable, memory-mapped "k-mer" database of all known genomes and their  ...  To optimize query performance for the database, we present a two-level index scheme that yields speedups of 8.4×–74× over a conventional hash table index.  ...  The use of memory map presents a convenient programming abstraction for performing out-of-core data access without the overhead of application buffer management and mix of standard I/O and in-core data  ... 
doi:10.1109/ipdpsw.2014.200 dblp:conf/ipps/AmesAHLG14 fatcat:3cwkxgr5cvhzbanjmfbrq4uwt4

Integrated in-system storage architecture for high performance computing

Dries Kimpe, Kathryn Mohror, Adam Moody, Brian Van Essen, Maya Gokhale, Rob Ross, Bronis R. de Supinski
2012 Proceedings of the 2nd International Workshop on Runtime and Operating Systems for Supercomputers - ROSS '12  
We are hoping that once complete, our efforts will reduce the overheads of checkpointing and data movement across the system and thus improve the scalability and reliability of HPC applications.  ...  These efforts are being integrated around an I/O-intensive workload provided by the scalable checkpoint/restart (SCR) library.  ...  Accordingly, the United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable  ... 
doi:10.1145/2318916.2318921 fatcat:aqu2jjfogfcidi4d4f7mqejj6u

Compiler-Assisted Data Distribution and Network Configuration for Chip Multiprocessors

Yong Li, A. Abousamra, R. Melhem, A. K. Jones
2012 IEEE Transactions on Parallel and Distributed Systems  
At run time, symbolic MMAPs are resolved and used by a partitioning algorithm to choose a partition of allocated memory blocks among the forked threads in the analyzed application.  ...  Data access latency, a limiting factor in the performance of chip multiprocessors, grows significantly with the number of cores in non-uniform cache architectures with distributed cache banks.  ...  In addition, CAC avoids the need to keep track of connections to pin at runtime, which requires communication counters for each source destination pair and may not be scalable to large numbers of cores  ... 
doi:10.1109/tpds.2011.279 fatcat:nhq7im22xjazddsfmgrwdlxofe

DuVisor: a User-level Hypervisor Through Delegated Virtualization [article]

Jiahao Chen, Dingji Li, Zeyu Mi, Yuxuan Liu, Binyu Zang, Haibing Guan, Haibo Chen
2022 arXiv   pre-print
Evaluation on FireSim shows that DuVisor outperforms KVM by up to 47.96% in a variety of real-world applications and significantly reduces the attack surface.  ...  driver from runtime intervention.  ...  Scaling Memory: To show DuVisor's memory scalability compared with KVM, we run STREAM [71], a memory-intensive benchmark, in a 4-vCPU guest VM with 512MB, 1024MB, 1536MB and 2048MB memory.  ... 
arXiv:2201.09652v1 fatcat:qtjgkef5krgf5obhrcrt7dr2ae

Slick: Secure Middleboxes using Shielded Execution [article]

Bohdan Trach, Alfred Krohmer, Sergei Arnautov, Franz Gregor, Pramod Bhatotia, Christof Fetzer
2019 arXiv   pre-print
Slick exposes a generic interface based on Click to design and implement a wide range of NFs using its out-of-the-box elements and C++ extensions.  ...  This motivated the design of Slick, a secure middlebox framework for deploying high-performance Network Functions (NFs) on untrusted commodity servers.  ...  Instead of passing MAP_HUGETLB flag to mmap(), it opens shared memory files in the hugetlbfs virtual filesystem and passes those file descriptors to mmap call, which is not protected by S . This mmap file-to-memory  ... 
arXiv:1709.04226v2 fatcat:z5dfluok3bfy3ic3vzzfr7g4nu

Exploring VM Introspection

Sahil Suneja, Canturk Isci, Eyal de Lara, Vasanth Bala
2015 SIGPLAN notices  
We present a comprehensive set of observations and best practices for efficient, accurate and consistent VMI operation based on our experiences with these techniques.  ...  Next we perform a thorough exploration of their trade-offs both qualitatively and quantitatively.  ...  We also thank Hao Chen for his insight during the initial phase of this work. This work is supported by an IBM Open Collaboration Research award.  ... 
doi:10.1145/2817817.2731196 fatcat:jkyulamqmbarpbyk3ybirtuhyq

Exploring VM Introspection

Sahil Suneja, Canturk Isci, Eyal de Lara, Vasanth Bala
2015 Proceedings of the 11th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments - VEE '15  
We present a comprehensive set of observations and best practices for efficient, accurate and consistent VMI operation based on our experiences with these techniques.  ...  Next we perform a thorough exploration of their trade-offs both qualitatively and quantitatively.  ...  We also thank Hao Chen for his insight during the initial phase of this work. This work is supported by an IBM Open Collaboration Research award.  ... 
doi:10.1145/2731186.2731196 dblp:conf/vee/SunejaILB15 fatcat:eeylneoqbba6nh6ujhikisaf3i

Understanding object-level memory access patterns across the spectrum

Xu Ji, Chao Wang, Nosayba El-Sayed, Xiaosong Ma, Youngjae Kim, Sudharshan S. Vazhkudai, Wei Xue, Daniel Sanchez
2017 Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis on - SC '17  
Especially we thank our shepherd, Simon Hammond, for his guidance and responsiveness. This work was supported in part by the National Key  ...  This makes judicious placement (for NUCA [6, 12, 26], NUMA [3, 14], emerging hybrid memory systems spanning multiple levels of heterogeneous memory hardware [16], and out-of-core systems [57])  ...  Such disparate data structure preference may motivate different design or optimization along the memory hierarchy, such as annotation by programmer or compiler for data placement [16, 39], to assist runtime  ... 
doi:10.1145/3126908.3126917 dblp:conf/sc/JiWEMKVXS17 fatcat:nqmd4px5hfawfhp6itubchgjr4

Palacios and Kitten: New high performance operating systems for scalable virtualized and native supercomputing

John Lange, Kevin Pedretti, Trammell Hudson, Peter Dinda, Zheng Cui, Lei Xia, Patrick Bridges, Andy Gocke, Steven Jaconette, Mike Levenhagen, Ron Brightwell
2010 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS)  
Palacios is a new open-source VMM under development at Northwestern University and the University of New Mexico that enables applications executing in a virtualized environment to achieve scalable high  ...  Additionally, Palacios leverages Kitten's simple memory management scheme to enable low-overhead pass-through of native devices to a virtualized environment.  ...  Kitten and Palacios together provide a scalable, flexible HPC system software platform that addresses the challenges laid out earlier and by others [6].  ... 
doi:10.1109/ipdps.2010.5470482 dblp:conf/ipps/LangePHDCXBGJLB10 fatcat:pfcr3drdhzarxkvsuu436r33am

Reconstructing Hardware Transactional Memory for Workload Optimized Systems [chapter]

Kunal Korgaonkar, Prabhat Jain, Deepak Tomar, Kashyap Garimella, Veezhinathan Kamakoti
2011 Lecture Notes in Computer Science  
As an event that has taken place for 16 years, APPT aims at providing a high-quality program for all attendees. We accepted 13 papers out of 40 submissions, presenting an acceptance rate of 32.5%.  ...  To ensure a high-quality program and ensure interactive discussions, we made authors aware of the existence of a pre-filtering mechanism.  ...  Mostly, programmers only need to abstract an application into a Map and a Reduce phases, while letting the underlying runtime manage parallelism and data distribution.  ... 
doi:10.1007/978-3-642-24151-2_1 fatcat:32cx745cn5cfdm5sbeah6eyiey

Lessons learned from the early performance evaluation of Intel Optane DC Persistent Memory in DBMS [article]

Yinjun Wu, Kwanghyun Park, Rathijit Sen, Brian Kroth, Jaeyoung Do
2020 arXiv   pre-print
However, interacting with NVM requires changes to application software to best use the device (e.g. mmap and clflush of small cache lines instead of write and fsync of large page buffers).  ...  of traditional DRAM memory.  ...  In DAX mode, applications use PMem via memory semantics (i.e., load and store instructions), after an initial interaction with the OS kernel through the mmap syscall to set up a virtual address space mapping  ... 
arXiv:2005.07658v1 fatcat:yip532aspvdvzkyaltqwh6idhq

A resistive TCAM accelerator for data-intensive computing

Qing Guo, Xiaochen Guo, Yuxin Bai, Engin İpek
2011 Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture - MICRO-44 '11  
average performance by 4× and average energy consumption by 10× on a set of evaluated data-intensive applications.  ...  Ternary content addressable memories (TCAM) hold the potential to address both problems in the context of a wide range of data-intensive workloads that benefit from associative search capability.  ...  The authors would like to thank Eby Friedman, Seung Kang, and anonymous reviewers for useful feedback.  ... 
doi:10.1145/2155620.2155660 dblp:conf/micro/GuoGBI11 fatcat:sun6egqax5bzjl3y5kvhpi3vz4