25,749 Hits in 3.9 sec

Memory-manager/scheduler co-design

Sapan Bhatia, Charles Consel, Julia Lawall
2006 Proceedings of the 2006 international symposium on Memory management - ISMM '06  
We propose a novel memory manager combined with a tailored scheduling strategy to restrict the working data set of the program to a memory region mapped directly into the data cache.  ...  Event-driven programming has emerged as a standard to implement high-performance servers due to its flexibility and low OS overhead. Still, memory access remains a bottleneck.  ...  Benchmarking the TUX and thttpd web servers has shown that data-cache misses are reduced by up to 75%, and the overall throughput of the server increases by up to 38%.  ... 
doi:10.1145/1133956.1133971 dblp:conf/iwmm/BhatiaCL06 fatcat:v5nnoupacbgn7kwmxrwwqhe7oy
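
The core idea here is cache-conscious placement: keep the server's hot per-connection state inside one region that maps cleanly onto the data cache. The sketch below is a generic illustration of that idea, not the paper's allocator or scheduler; the 32 KiB cache size and 512-byte buffer size are assumptions.

```c
/* Toy illustration: confine a server's per-connection buffers to one
 * cache-sized, cache-aligned arena so the whole working set can stay
 * resident in the L1 data cache. CACHE_SIZE and BUF_SIZE are assumptions. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CACHE_SIZE (32 * 1024)   /* assumed 32 KiB L1 data cache   */
#define BUF_SIZE   512           /* assumed per-connection buffer  */
#define NBUFS      (CACHE_SIZE / BUF_SIZE)

static char *slab;               /* one cache-sized arena          */
static int   next_free;

/* Hand out buffers only from inside the cache-sized arena. */
static void *cache_resident_alloc(void)
{
    if (next_free >= NBUFS)
        return NULL;             /* working set would exceed the cache */
    return slab + (next_free++) * BUF_SIZE;
}

int main(void)
{
    /* Align the arena to the cache size so offsets map to distinct sets. */
    if (posix_memalign((void **)&slab, CACHE_SIZE, CACHE_SIZE) != 0)
        return 1;

    void *conn_buf = cache_resident_alloc();
    if (conn_buf) {
        memset(conn_buf, 0, BUF_SIZE);
        printf("buffer at offset %td inside the arena\n",
               (char *)conn_buf - slab);
    }
    free(slab);
    return 0;
}
```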

Automatic Skeleton-Driven Memory Affinity for Transactional Worklist Applications

Luís Fabrício Wanderley Góes, Christiane Pousa Ribeiro, Márcio Castro, Jean-François Méhaut, Murray Cole, Marcelo Cintra
2013 International journal of parallel programming  
First, it addresses memory affinity at the DRAM level by automatically selecting page allocation policies. Then it employs data prefetching helper threads to improve affinity at the cache level.  ...  In this paper, we thus propose a skeleton-driven mechanism to improve memory affinity in STM applications that fit the worklist pattern, employing a two-level approach.  ...  However, typical STM systems are decoupled from issues of thread and memory management, leading to poor exploitation of memory affinity by the native operating system.  ... 
doi:10.1007/s10766-013-0253-x fatcat:6n2g3boe3rfo3nnwyfdkjymkbu
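
The DRAM-level half of such an approach comes down to choosing where pages land relative to the thread that uses them. The following is a minimal libnuma sketch of node-local page placement, not the paper's skeleton framework; the chunk size and node number are arbitrary assumptions.

```c
/* Minimal libnuma sketch: place one worklist chunk's pages on the NUMA node
 * the worker thread is pinned to. Compile with: gcc affinity.c -lnuma */
#include <numa.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "no NUMA support on this system\n");
        return 1;
    }

    size_t chunk = 4096 * 1024;    /* assumed worklist chunk size        */
    int node = 0;                  /* node this worker is assumed to run on */

    /* "bind" policy: allocate the chunk's pages on the worker's node. */
    double *work = numa_alloc_onnode(chunk, node);
    if (!work)
        return 1;

    for (size_t i = 0; i < chunk / sizeof(double); i++)
        work[i] = 0.0;             /* pages are touched on node 0 */

    printf("chunk placed on node %d (highest node: %d)\n",
           node, numa_max_node());
    numa_free(work, chunk);
    return 0;
}
```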

Power aware page allocation

Alvin R. Lebeck, Xiaobo Fan, Heng Zeng, Carla Ellis
2000 SIGPLAN notices  
Memory is a particularly important target for efforts to improve energy efficiency.  ...  We perform experiments using two complementary simulation environments: a trace-driven simulator with workload traces that are representative of mobile computing and an execution-driven simulator with  ...  This work supported in part by DARPA Grant DABT63-98-1-0001, NSF Grants CDA-97-2637, CDA-95-12356, EIA-99-72879, EIA-99-86024, NSF CAREER Award MIP-97-02547, Duke University, and equipment donations from  ... 
doi:10.1145/356989.356999 fatcat:u6wpzhayubhrhe2jqwm4g4bvnm
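
The allocation idea behind power-aware page placement can be pictured with a toy model (an illustration, not the paper's policy): pack allocated pages into as few memory banks as possible so untouched banks can drop into a low-power state. The bank count and pages-per-bank figures below are assumptions.

```c
/* Toy model of power-aware page allocation: cluster allocations into as few
 * banks as possible so cold banks remain idle and eligible for power-down. */
#include <stdio.h>

#define NBANKS          8
#define PAGES_PER_BANK  64

static int used[NBANKS];   /* pages currently allocated in each bank */

/* Prefer the most-filled bank that still has room, so the remaining
 * banks stay completely untouched. */
static int alloc_page(void)
{
    int best = -1;
    for (int b = 0; b < NBANKS; b++)
        if (used[b] < PAGES_PER_BANK &&
            (best < 0 || used[b] > used[best]))
            best = b;
    if (best >= 0)
        used[best]++;
    return best;
}

int main(void)
{
    for (int i = 0; i < 100; i++)
        alloc_page();

    int idle = 0;
    for (int b = 0; b < NBANKS; b++)
        if (used[b] == 0)
            idle++;
    printf("%d of %d banks idle and eligible for power-down\n", idle, NBANKS);
    return 0;
}
```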

Power aware page allocation

Alvin R. Lebeck, Xiaobo Fan, Heng Zeng, Carla Ellis
2000 SIGARCH Computer Architecture News  
Memory is a particularly important target for efforts to improve energy efficiency.  ...  We perform experiments using two complementary simulation environments: a trace-driven simulator with workload traces that are representative of mobile computing and an execution-driven simulator with  ...  This work supported in part by DARPA Grant DABT63-98-1-0001, NSF Grants CDA-97-2637, CDA-95-12356, EIA-99-72879, EIA-99-86024, NSF CAREER Award MIP-97-02547, Duke University, and equipment donations from  ... 
doi:10.1145/378995.379007 fatcat:qnqgxv2whfarjjgzxwyqhk7mva

Power aware page allocation

Alvin R. Lebeck, Xiaobo Fan, Heng Zeng, Carla Ellis
2000 ACM SIGOPS Operating Systems Review  
Memory is a particularly important target for efforts to improve energy efficiency.  ...  We perform experiments using two complementary simulation environments: a trace-driven simulator with workload traces that are representative of mobile computing and an execution-driven simulator with  ...  This work supported in part by DARPA Grant DABT63-98-1-0001, NSF Grants CDA-97-2637, CDA-95-12356, EIA-99-72879, EIA-99-86024, NSF CAREER Award MIP-97-02547, Duke University, and equipment donations from  ... 
doi:10.1145/384264.379007 fatcat:hhwavknlufh6fez7u32zt6y3py

Power aware page allocation

Alvin R. Lebeck, Xiaobo Fan, Heng Zeng, Carla Ellis
2000 Proceedings of the ninth international conference on Architectural support for programming languages and operating systems - ASPLOS-IX  
Memory is a particularly important target for efforts to improve energy efficiency.  ...  We perform experiments using two complementary simulation environments: a trace-driven simulator with workload traces that are representative of mobile computing and an execution-driven simulator with  ...  This work supported in part by DARPA Grant DABT63-98-1-0001, NSF Grants CDA-97-2637, CDA-95-12356, EIA-99-72879, EIA-99-86024, NSF CAREER Award MIP-97-02547, Duke University, and equipment donations from  ... 
doi:10.1145/378993.379007 fatcat:ysngeu4hjrbwvgy2digecmi5we

Cache simulator based on GPU acceleration

Wan Han, Gao Xiaopeng, Wang Zhiqiang
2009 Proceedings of the Second International ICST Conference on Simulation Tools and Techniques  
Our experimental result shows that the new algorithm gains a 2.5x performance improvement compared to the traditional CPU-based serial algorithm.  ...  Cache technology plays a fundamental role in modern computer systems as it bridges the speed gap between processor and memory.  ...  ACKNOWLEDGMENTS Our thanks for the support provided by the National High Technology Research and Development Program (2007AA01Z183).  ... 
doi:10.4108/icst.simutools2009.5562 dblp:conf/simutools/HanXZ09 fatcat:x2aprvs52nhvzei3e2famwaddu
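
For readers unfamiliar with the baseline being accelerated, a trace-driven cache simulator in its simplest serial form looks roughly like the sketch below: a direct-mapped cache with assumed parameters and an assumed trace format, not the authors' simulator or its GPU port.

```c
/* Minimal serial, trace-driven cache simulator: direct-mapped, 64-byte
 * lines, 32 KiB capacity. Reads one hexadecimal address per line on stdin
 * (an assumed trace format) and reports hits vs. accesses. */
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define LINE_BITS 6          /* 64-byte lines            */
#define NSETS     512        /* 512 sets, direct-mapped  */

static uint64_t tags[NSETS];
static int      valid[NSETS];

static int access_cache(uint64_t addr)
{
    uint64_t line = addr >> LINE_BITS;
    uint64_t set  = line % NSETS;
    uint64_t tag  = line / NSETS;

    if (valid[set] && tags[set] == tag)
        return 1;            /* hit                 */
    valid[set] = 1;          /* miss: fill the line */
    tags[set]  = tag;
    return 0;
}

int main(void)
{
    uint64_t addr, hits = 0, total = 0;
    while (scanf("%" SCNx64, &addr) == 1) {
        hits += access_cache(addr);
        total++;
    }
    printf("%" PRIu64 " hits / %" PRIu64 " accesses\n", hits, total);
    return 0;
}
```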

Scheduling Algorithms with Bus Bandwidth Considerations for SMPs [chapter]

Christos D. Antonopoulos, Dimitrios S. Nikolopoulos, Theodore S. Papatheodorou
2006 High-Performance Computing  
The necessary information is provided by performance monitoring counters, which are present in all modern processors.  ...  The new scheduling policies improve system throughput by up to 68% (26% on average) in comparison with the standard Linux scheduler.  ...  The goal has always been to optimize the programs for the memory hierarchy and improve cache locality.  ... 
doi:10.1002/0471732710.ch16 fatcat:oq2pxgeq2bbmnclpyngg5tw4k4

Scheduling algorithms with bus bandwidth considerations for SMPs

C.D. Antonopoulos, D.S. Nikolopoulos, T.S. Papatheodorou
2003 2003 International Conference on Parallel Processing, 2003. Proceedings.  
The necessary information is provided by performance monitoring counters, which are present in all modern processors.  ...  The new scheduling policies improve system throughput by up to 68% (26% on average) in comparison with the standard Linux scheduler.  ...  The goal has always been to optimize the programs for the memory hierarchy and improve cache locality.  ... 
doi:10.1109/icpp.2003.1240622 dblp:conf/icpp/AntonopoulosNP03 fatcat:vqmmhe3ztfhzxbx5ymk5fn7q54
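
The ingredient both listings of this work rely on, sampling hardware performance counters to estimate a thread's bus bandwidth demand, can be illustrated with the Linux perf_event_open interface. The sketch below uses last-level cache misses as a rough bandwidth proxy and does not reproduce the scheduling policy itself.

```c
/* Sample a hardware counter (LLC misses) around a memory-intensive phase
 * using perf_event_open, as a rough proxy for the phase's bandwidth demand. */
#include <linux/perf_event.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof(attr);
    attr.config = PERF_COUNT_HW_CACHE_MISSES;  /* last-level cache misses */
    attr.disabled = 1;
    attr.exclude_kernel = 1;

    int fd = syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
    if (fd < 0) { perror("perf_event_open"); return 1; }

    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);

    /* Memory-intensive phase whose bandwidth demand we want to estimate. */
    static char buf[1 << 24];
    for (int pass = 0; pass < 4; pass++)
        for (size_t i = 0; i < sizeof(buf); i += 64)
            buf[i]++;

    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
    uint64_t misses = 0;
    read(fd, &misses, sizeof(misses));
    printf("LLC misses during phase: %llu\n", (unsigned long long)misses);
    close(fd);
    return 0;
}
```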

Memory-Side Acceleration for XML Parsing [chapter]

Jie Tang, Shaoshan Liu, Zhimin Gu, Chen Liu, Jean-Luc Gaudiot
2011 Lecture Notes in Computer Science  
Our results show that this technique is able to improve performance by up to 20% as well as produce up to 12.77% energy savings when implemented in 32 nm technology.  ...  In this paper, we analyze the performance of XML parsing and identify that a significant fraction of the performance overhead is incurred by memory data loading.  ...  By now we have shown that memory-side acceleration can significantly improve XML parsing performance.  ... 
doi:10.1007/978-3-642-24403-2_22 fatcat:ptqjlauly5cnvou4znsr736bie

Improving Cloud System Performances by Adopting Nvram-Based Storage Systems

Jisun Kim, Yunjoo Park, Sunhwa A Nam, Hyunkyoung Choi, KyungWoon Cho, Hyokyung Bahn
2016 International Journal of Natural Sciences Research  
We first consider NVRAM as an additional storage cache and show how much performance improvement can be obtained if we adopt an NVRAM cache.  ...  However, in the case of memory-intensive workloads, NVRAM memory significantly improves the performance, and NVRAM swap also gains a certain level of improvement.  ...  However, in the case of memory-intensive workloads, NVRAM memory significantly improves the performance of the system as it extends the effective memory capacity.  ... 
doi:10.18488/journal.63/2016.4.6/63.6.100.106 fatcat:susdlad3ubacvcydxz3y64wpdq
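
A minimal way to picture "NVRAM as an additional storage cache" is a block cache kept in a byte-addressable persistent region. The sketch below models the NVRAM as an mmap'ed file (the /mnt/pmem path is hypothetical) with a direct-mapped tag array kept in DRAM; it is an illustration of the idea, not the paper's system, and the slot and block sizes are assumptions.

```c
/* Illustration: use a persistent-memory file as a direct-mapped cache of
 * disk blocks. The path below is a hypothetical DAX-mounted pmem file. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define BLK   4096
#define SLOTS 256                      /* 1 MiB of NVRAM cache */

int main(void)
{
    int fd = open("/mnt/pmem/cache.img", O_RDWR | O_CREAT, 0600);
    if (fd < 0) { perror("open nvram"); return 1; }
    if (ftruncate(fd, (off_t)SLOTS * BLK) < 0) { perror("ftruncate"); return 1; }

    unsigned char *nvram = mmap(NULL, (size_t)SLOTS * BLK,
                                PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (nvram == MAP_FAILED) { perror("mmap"); return 1; }

    static int64_t tag[SLOTS];         /* which disk block each slot holds */
    memset(tag, -1, sizeof(tag));

    int64_t blkno = 42;                /* block we want to read */
    int slot = (int)(blkno % SLOTS);

    if (tag[slot] != blkno) {
        /* miss: fetch from the slow backing store (elided) and install */
        memset(nvram + (size_t)slot * BLK, 0, BLK);
        tag[slot] = blkno;
        msync(nvram + (size_t)slot * BLK, BLK, MS_SYNC);  /* persist the line */
    }
    /* hit or freshly installed: serve the block from byte-addressable NVRAM */

    printf("block %lld cached in NVRAM slot %d\n", (long long)blkno, slot);
    munmap(nvram, (size_t)SLOTS * BLK);
    close(fd);
    return 0;
}
```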

CARAM: A Content-Aware Hybrid PCM/DRAM Main Memory System Framework [article]

Yinjin Fu
2020 arXiv   pre-print
It also substantially extends available free memory space by coalescing redundant lines in hybrid memory, thereby further improving the wear-leveling efficiency of PCM.  ...  To obtain high data access performance, we also design a set of acceleration techniques to minimize the overhead caused by extra computation costs.  ...  Acknowledgments This research was supported by the NSF-Jiangsu grant BK20191327. We would like to thank Prof. Patrick P.C. Lee and Dr. Yang Wu for their help on the initial design of the system.  ... 
arXiv:2007.13661v1 fatcat:pkwbsjdt4nfztaoaatofocrb6a
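
The line-coalescing idea, storing one copy of identical memory lines and pointing duplicates at it, can be sketched as follows. This is a generic illustration, not CARAM's design; a real system would index lines by a content hash rather than the linear scan used here for brevity.

```c
/* Coalesce identical 64-byte lines: duplicates share an existing slot
 * instead of consuming new storage. */
#include <stdio.h>
#include <string.h>

#define LINE  64
#define SLOTS 1024

static unsigned char store[SLOTS][LINE];
static int used_slots;

/* Store a line and return the slot holding its content. */
static int write_line(const unsigned char *line)
{
    for (int i = 0; i < used_slots; i++)
        if (memcmp(store[i], line, LINE) == 0)
            return i;                  /* duplicate content: share the slot */
    if (used_slots == SLOTS)
        return -1;                     /* store full */
    memcpy(store[used_slots], line, LINE);
    return used_slots++;
}

int main(void)
{
    unsigned char zero[LINE] = {0}, ones[LINE];
    memset(ones, 0xff, LINE);

    int a = write_line(zero);
    int b = write_line(ones);
    int c = write_line(zero);          /* identical content: reuses slot a */

    printf("slots: %d %d %d -> %d distinct lines stored\n",
           a, b, c, used_slots);
    return 0;
}
```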

Locality-Driven Dynamic GPU Cache Bypassing

Chao Li, Shuaiwen Leon Song, Hongwen Dai, Albert Sidelnik, Siva Kumar Sastry Hari, Huiyang Zhou
2015 Proceedings of the 29th ACM on International Conference on Supercomputing - ICS '15  
Existing GPU cache management schemes are either based on conditional/reactive solutions or hit-rate based designs specifically developed for CPU last-level caches, which can limit overall performance.  ...  Results show that our proposed design can dramatically reduce cache contention and achieve up to 56.8% and an average of 30.3% performance improvement over the baseline architecture, for a range of highly-optimized  ...  GPU cache management: Previous work [22, 24, 17, 20] has made efforts to improve cache performance by changing the warp scheduling policies.  ... 
doi:10.1145/2751205.2751237 dblp:conf/ics/LiSDSHZ15 fatcat:6l4xtl2wqjb5ta7xptompqxfiy
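
A rough, CPU-side caricature of locality-driven bypassing (not the paper's GPU hardware design): remember lines that were evicted without ever being re-referenced, and let later fills of those lines bypass the cache so they stop thrashing reused data. The cache and filter sizes below are assumptions.

```c
/* Toy bypass policy: lines evicted without reuse are flagged; subsequent
 * misses on flagged lines skip the fill instead of evicting reused data. */
#include <stdint.h>
#include <stdio.h>

#define NSETS  256
#define FILTER 8192

struct line { uint64_t tag; int valid, reused; };
static struct line cache[NSETS];
static uint64_t no_reuse[FILTER];          /* stores line address + 1; 0 = empty */

/* Returns 1 on hit, 0 on miss. */
static int access_line(uint64_t laddr)
{
    uint64_t set = laddr % NSETS, tag = laddr / NSETS;
    struct line *l = &cache[set];

    if (l->valid && l->tag == tag) {
        l->reused = 1;                     /* the line showed reuse */
        return 1;
    }
    if (no_reuse[laddr % FILTER] == laddr + 1)
        return 0;                          /* predicted streaming: bypass fill */

    if (l->valid && !l->reused) {          /* victim never reused: flag it */
        uint64_t victim = l->tag * NSETS + set;
        no_reuse[victim % FILTER] = victim + 1;
    }
    l->valid = 1; l->tag = tag; l->reused = 0;
    return 0;
}

int main(void)
{
    long hits = 0, total = 0;
    for (int pass = 0; pass < 10; pass++)
        for (uint64_t j = 0; j < 4096; j++) {
            hits += access_line(j % 8);    total++;   /* hot, reused lines   */
            hits += access_line(1000 + j); total++;   /* long streaming scan */
        }
    printf("hit rate: %.1f%%\n", 100.0 * hits / total);
    return 0;
}
```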

A Shell Script - Cleaner

Riya Patil
2019 International Journal for Research in Applied Science and Engineering Technology  
In a Linux system, cleaning all of the cache and temporary memory is a complex task done from the command line. Memory management in any operating system is considered a complex task.  ...  This results in better memory management and cache cleaning techniques.  ...  As the shell script cleaner is integrated with distribution-specific commands, all the cache and junk files get cleaned quickly, which helps Linux memory management and improves performance.  ... 
doi:10.22214/ijraset.2019.5658 fatcat:5jp6rrfdbrfovmt2se75i53h2e
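
One concrete step such a cleaner performs on Linux, dropping the reclaimable kernel caches, can be done from C as well as from a script. The snippet below writes to /proc/sys/vm/drop_caches, a standard kernel interface that requires root; it illustrates the mechanism and is not the author's script.

```c
/* Flush dirty pages, then ask the kernel to drop the page cache and the
 * dentry/inode caches. Must be run as root. */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    sync();                                   /* write dirty pages back first */

    FILE *f = fopen("/proc/sys/vm/drop_caches", "w");
    if (!f) { perror("drop_caches"); return 1; }
    fputs("3\n", f);                          /* 1 = pagecache, 2 = dentries/inodes, 3 = both */
    fclose(f);

    puts("reclaimable kernel caches dropped");
    return 0;
}
```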

A selective compressed memory system by on-line data decompressing

Jang-Soo Lee, Won-Kee Hong, Shin-Dug Kim
1999 Proceedings 25th EUROMICRO Conference. Informatics: Theory and Practice for the New Millennium  
The selective compression technique can reduce the decompression overhead caused by on-line data decompression, and the fixed memory space allocation allows efficient management of the compressed blocks  ...  The results from trace-driven simulation show that the SCMS approach can provide around a 35% decrease in the on-chip cache miss ratio as well as a 53% decrease in the data traffic over the conventional  ...  As the processor-memory performance gap increases every year, long memory access latencies are becoming the primary obstacle to improving the performance of computer systems.  ... 
doi:10.1109/eurmic.1999.794470 dblp:conf/euromicro/LeeHK99 fatcat:hjpagiomubclvlfns4imb5phg4
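
The "selective" part of selective compression is essentially a size test at store time: keep a block compressed only if it fits a fixed half-size slot, otherwise store it uncompressed and skip decompression on reads. The sketch below uses a trivial run-length encoder so it stays self-contained; it is an illustration of that decision, not the SCMS design, and the block size is an assumption.

```c
/* Decide per 64-byte block whether to store it compressed (fits half a
 * block) or uncompressed (everything else). */
#include <stdio.h>

#define BLOCK 64

/* Trivial run-length encoder. Returns the encoded size, or BLOCK + 1 if
 * the output would not fit in `cap` bytes. */
static int rle_encode(const unsigned char *in, unsigned char *out, int cap)
{
    int o = 0;
    for (int i = 0; i < BLOCK; ) {
        int run = 1;
        while (i + run < BLOCK && in[i + run] == in[i] && run < 255)
            run++;
        if (o + 2 > cap)
            return BLOCK + 1;
        out[o++] = (unsigned char)run;
        out[o++] = in[i];
        i += run;
    }
    return o;
}

int main(void)
{
    unsigned char zeros[BLOCK] = {0}, noisy[BLOCK], buf[BLOCK];
    for (int i = 0; i < BLOCK; i++)
        noisy[i] = (unsigned char)(i * 37);   /* incompressible by RLE */

    int a = rle_encode(zeros, buf, BLOCK / 2);
    int b = rle_encode(noisy, buf, BLOCK / 2);

    printf("all-zero block : %s (%d bytes)\n",
           a <= BLOCK / 2 ? "store compressed" : "store uncompressed", a);
    printf("noisy block    : %s\n",
           b <= BLOCK / 2 ? "store compressed" : "store uncompressed");
    return 0;
}
```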