
Feedback mechanisms for improving probabilistic memory prefetching

Ibrahim Hur, Calvin Lin
2009 IEEE 15th International Symposium on High Performance Computer Architecture  
The ASD prefetcher is a standard stream buffer that takes a probabilistic, feedback-based approach to identifying streams.  ...  This paper presents three techniques for improving the effectiveness of the recently proposed Adaptive Stream Detection (ASD) prefetching mechanism.  ...  We thank the entire IBM Power5 team and the anonymous referees for their valuable comments.  ... 
doi:10.1109/hpca.2009.4798282 dblp:conf/hpca/HurL09 fatcat:pthunqnu4bf3xkcicpqmbny5j4

Feedback Directed Prefetching: Improving the Performance and Bandwidth-Efficiency of Hardware Prefetchers

Santhosh Srinath, Onur Mutlu, Hyesoon Kim, Yale N. Patt
2007 IEEE 13th International Symposium on High Performance Computer Architecture  
This paper proposes a mechanism that incorporates dynamic feedback into the design of the prefetcher to increase the performance improvement provided by prefetching as well as to reduce the negative performance  ...  Using the proposed dynamic mechanism improves average performance by 6.5% on 17 memory-intensive benchmarks in the SPEC CPU2000 suite compared to the best-performing conventional stream-based data prefetcher  ...  Acknowledgments We thank Matthew Merten, Moinuddin Qureshi, members of the HPS Research Group, and the anonymous reviewers for their comments and suggestions.  ... 
doi:10.1109/hpca.2007.346185 dblp:conf/hpca/SrinathMKP07 fatcat:freouwwyfvf6ljqss2fnj7guii
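The feedback idea in this entry lends itself to a compact sketch. The following Python sketch is not from the paper; the interval handling, accuracy thresholds, and the (distance, degree) aggressiveness table are illustrative assumptions. It shows how measured prefetch accuracy can step a prefetcher's aggressiveness up or down each interval:

```python
# Illustrative aggressiveness levels as (prefetch distance, degree) pairs.
# The values below are assumptions, not the configuration from the paper.
AGGRESSIVENESS = [(4, 1), (8, 1), (16, 2), (32, 4), (64, 4)]

class FeedbackThrottle:
    def __init__(self):
        self.level = 2          # start mid-range
        self.useful = 0         # prefetches later hit by a demand access
        self.issued = 0

    def record(self, was_useful):
        """Called once per prefetch whose outcome is known."""
        self.issued += 1
        if was_useful:
            self.useful += 1

    def end_interval(self, high=0.75, low=0.40):
        """Adjust the level from this interval's accuracy, reset counters,
        and return the (distance, degree) to use next interval."""
        if self.issued:
            acc = self.useful / self.issued
            if acc >= high and self.level < len(AGGRESSIVENESS) - 1:
                self.level += 1          # accurate: prefetch more aggressively
            elif acc < low and self.level > 0:
                self.level -= 1          # inaccurate: throttle down
        self.useful = self.issued = 0
        return AGGRESSIVENESS[self.level]
```

A hardware version would keep these counters in a few registers per prefetcher; the mechanism in the paper also factors in lateness and cache pollution, which this sketch omits.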

Making Address-Correlated Prefetching Practical

Thomas F. Wenisch, Michael Ferdman, Anastasia Ailamaki, Babak Falsafi, Andreas Moshovos
2010 IEEE Micro  
thank the anonymous reviewers for their feedback.  ...  So, the prefetching mechanism must be designed to account for long correlation table lookup latency.  ... 
doi:10.1109/mm.2010.21 fatcat:4nnthxry2rbdvejylzswlmbcja

Practical off-chip meta-data for temporal memory streaming

Thomas F. Wenisch, Michael Ferdman, Anastasia Ailamaki, Babak Falsafi, Andreas Moshovos
2009 IEEE 15th International Symposium on High Performance Computer Architecture  
Prior research demonstrates that temporal memory streaming and related address-correlating prefetchers improve performance of commercial server workloads through increased memory-level parallelism.  ...  For maximum effectiveness, STMS needs 64MB of meta-data in main memory, a small fraction of memory in servers. • Latency efficiency.  ...  Acknowledgements The authors would like to thank Brian Gold and the anonymous reviewers for their feedback.  ... 
doi:10.1109/hpca.2009.4798239 dblp:conf/hpca/WenischFAFM09 fatcat:qzies3ngwjaetpsnel7mbbclkq

Data Cache Prefetching with Perceptron Learning [article]

Haoyuan Wang, Zhiwei Luo
2017 arXiv   pre-print
This mechanism boosts execution performance by mitigating cache pollution and eliminating redundant memory requests issued by the prefetcher.  ...  Though it is possible that the perceptron may reject useful blocks and thus cause a minor rise in cache miss rate, the lower memory request count can decrease average memory access latency, which compensates for  ...  However, the IPC improvements seemingly run contrary to this. We suggest that reduced pressure on the memory subsystem leads to this.  ... 
arXiv:1712.00905v1 fatcat:ivqomnevvnfhhdwxj5fbe4vh3e
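The perceptron filter described in this entry can be sketched in a few lines. The feature encoding and weight clipping below are assumptions for illustration, not the configuration from the paper; the core idea is that a vector of saturating weights votes on whether a candidate prefetch is likely useful:

```python
class PerceptronFilter:
    """A minimal perceptron that gates prefetch requests: issue only when
    the weighted vote over the candidate's features is non-negative."""

    def __init__(self, n_features, threshold=0):
        self.w = [0] * n_features   # one signed weight per feature
        self.threshold = threshold

    def predict(self, x):
        """x is a list of +1/-1 feature values; True means 'issue prefetch'."""
        return sum(wi * xi for wi, xi in zip(self.w, x)) >= self.threshold

    def train(self, x, useful, clip=32):
        """Once the prefetch's outcome is known, nudge the weights toward
        the truth, saturating at +/-clip so no feature dominates forever."""
        if self.predict(x) != useful:
            t = 1 if useful else -1
            self.w = [max(-clip, min(clip, wi + t * xi))
                      for wi, xi in zip(self.w, x)]
```

With all-zero weights the filter starts optimistic (every prefetch passes); mispredictions then push the weights until useless patterns are rejected.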

Band-Pass Prefetching

Aswinkumar Sridharan, Biswabandan Panda, Andre Seznec
2017 ACM Transactions on Architecture and Code Optimization (TACO)  
Therefore, prior works leave scope for performance improvement. Towards this end, we propose a solution to manage prefetching in multi-core systems.  ...  In multi-core systems, an application's prefetcher can interfere with the memory requests of other applications using the shared resources, such as last level cache and memory bandwidth.  ...  The authors thank the anonymous reviewers and the ALF/PACAP team for its valuable feedback on this work.  ... 
doi:10.1145/3090635 fatcat:aeact4t3dveetdx65yeuyfxz3e

Memory Prefetching Using Adaptive Stream Detection

Ibrahim Hur, Calvin Lin
2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06)  
We use this concept to design a prefetcher that resides on an on-chip memory controller.  ...  Using highly accurate simulators for the IBM Power5+, we show that this prefetcher improves performance of the SPEC2006fp benchmarks by an average of 32.7% when compared against a Power5+ that performs  ...  We thank Alper Buyuktosunoglu for his helpful expertise on power consumption. We thank Doug Burger and E Lewis for their comments on an early draft of this paper.  ... 
doi:10.1109/micro.2006.32 dblp:conf/micro/HurL06 fatcat:xcep64twvfgdfhj6jwehnhck7y
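Adaptive Stream Detection rests on a simple probabilistic observation that can be sketched directly. The sketch below is illustrative, not the paper's implementation; in particular the probability threshold is an assumption. It records a histogram of past stream lengths and prefetches for a stream of length L only when streams that reached L usually grew longer:

```python
from collections import Counter

class StreamHistogram:
    """Decide whether to prefetch based on how often past streams of the
    current length went on to become longer streams."""

    def __init__(self, threshold=0.5):
        self.lengths = Counter()    # stream length -> occurrence count
        self.threshold = threshold

    def record_stream(self, length):
        """Called when a stream ends, with its final length."""
        self.lengths[length] += 1

    def should_prefetch(self, current_len):
        """Prefetch iff P(stream grows past current_len | reached it) is high."""
        at_least = sum(n for l, n in self.lengths.items() if l >= current_len)
        longer = sum(n for l, n in self.lengths.items() if l > current_len)
        return at_least > 0 and longer / at_least >= self.threshold
```

Because the histogram is rebuilt per epoch, the prefetcher adapts when a workload shifts between short, noisy streams and long sequential ones.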

Prefetch-Aware DRAM Controllers

Chang Joo Lee, Onur Mutlu, Veynu Narasiman, Yale N. Patt
2008 41st IEEE/ACM International Symposium on Microarchitecture  
Our evaluation shows that PADC significantly outperforms previous memory controllers with rigid prefetch handling policies.  ...  The key idea is to 1) adaptively prioritize between demand and prefetch requests, and 2) drop useless prefetches to free up memory system resources, based on the accuracy of the prefetcher.  ...  We also compare and incorporate PADC with other mechanisms such as hardware prefetch filtering, feedback directed prefetching, memory address remapping, and runahead execution.  ... 
doi:10.1109/micro.2008.4771791 dblp:conf/micro/LeeMNP08 fatcat:r2rmrj643jcwpkjknto2ixr2xa
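The two-part policy in this entry, adaptive prioritization plus dropping, can be sketched as a scheduling function over the memory request queue. The thresholds and the queue representation below are illustrative assumptions, not PADC's actual design:

```python
def order_requests(queue, accuracy, high=0.85, drop_age=100):
    """queue: list of dicts {'kind': 'demand'|'prefetch', 'age': cycles}.
    When measured prefetch accuracy is high, prefetches are scheduled like
    demands (oldest first). When it is low, demands go first and stale
    prefetches are dropped to free memory system resources."""
    if accuracy >= high:
        # Accurate prefetcher: treat prefetches as demands, oldest first.
        return sorted(queue, key=lambda r: -r['age'])
    # Inaccurate prefetcher: drop prefetches that have lingered too long,
    # then serve all demands before the surviving prefetches.
    keep = [r for r in queue if r['kind'] == 'demand' or r['age'] <= drop_age]
    return sorted(keep, key=lambda r: (r['kind'] != 'demand', -r['age']))
```

A real controller would interleave this with row-buffer and bank scheduling; the sketch only captures the priority/drop decision driven by prefetch accuracy.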

Prefetch-Aware Memory Controllers

Chang Joo Lee, Onur Mutlu, Veynu Narasiman, Yale N. Patt
2011 IEEE transactions on computers  
The key idea is to 1) adaptively prioritize between demands and prefetches, and 2) drop useless prefetches to free up memory system resources, based on prefetch accuracy.  ...  Our evaluation shows that PADC significantly outperforms previous memory controllers with rigid prefetch handling policies.  ...  We also thank the anonymous reviewers for their comments. Chang Joo Lee and Veynu Narasiman were supported by IBM and NVIDIA PhD fellowships respectively during this work.  ... 
doi:10.1109/tc.2010.214 fatcat:m5bbeqxcjjdfxhcetpr46kzrpa

Access map pattern matching for data cache prefetch

Yasuo Ishii, Mary Inaba, Kei Hiraki
2009 Proceedings of the 23rd international conference on Conference on Supercomputing - ICS '09  
For example, out-of-order execution changes the memory access order.  ...  Hardware data prefetching is widely adopted to hide long memory latency.  ...  The Markov prefetcher detects probabilistic address correlation [3].  ... 
doi:10.1145/1542275.1542349 dblp:conf/ics/IshiiIH09 fatcat:lj4udf4lpnhovol3ubmpia2gue
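Access map pattern matching is robust to the reordering mentioned in this entry because it matches on a bitmap of touched lines rather than on access order. The sketch below is a simplified illustration; the zone size, stride range, and candidate test are assumptions, not the paper's exact tables:

```python
ZONE_LINES = 64   # cache lines tracked per memory zone (assumed size)

def ampm_candidates(access_map, line, max_stride=8):
    """access_map: set of line indices already touched within one zone.
    For each stride k (both directions), if the two lines behind the
    current access at that stride were touched, the next line ahead is a
    prefetch candidate. Order of past accesses is irrelevant."""
    out = []
    for k in range(1, max_stride + 1):
        for s in (k, -k):
            if ((line - s) in access_map
                    and (line - 2 * s) in access_map
                    and 0 <= line + s < ZONE_LINES
                    and (line + s) not in access_map):
                out.append(line + s)
    return out
```

Since only the bitmap is consulted, an out-of-order core that touches lines 14, 10, 12 still produces the same candidates as an in-order 10, 12, 14 sequence.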

Automatically Generating Symbolic Prefetches for Distributed Transactional Memories [chapter]

Alokika Dash, Brian Demsky
2010 Lecture Notes in Computer Science  
We evaluate this prefetching mechanism in the context of a middleware framework for distributed transactional memory.  ...  We measured speedups due to prefetching of up to 13.31× for accessing arrays and 4.54× for accessing linked lists.  ...  We would like to thank Brad Chamberlain for feedback on our paper and the anonymous reviewers for their helpful comments.  ... 
doi:10.1007/978-3-642-16955-7_18 fatcat:n3nsbvgzb5h6xoykvya35jbuvu

Probabilistic Directed Writebacks for Exclusive Caches

Lena E. Olson, Mark D. Hill
2016 SIGARCH Computer Architecture News  
This approach yields a large reduction in the number of LLC writebacks: 25% fewer for SPEC on average, 80% fewer for graph500, and 67% fewer for an in-memory hash table.  ...  modification for a single PC.  ...  We adapt Flajolet and Martin's probabilistic counting to keep the state small: two additional bits per L1D block, with an additional 6KB prediction table.  ...  for improving LLC behavior.  ... 
doi:10.1145/2971331.2971334 fatcat:tpsk5t5kqjdtzidheugn7goxfy
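Flajolet-Martin counting, which this entry adapts to keep per-block state tiny, is worth a small sketch. The version below is a rough single-estimator variant for illustration (real FM averages over many hash functions, and the hash choice here is an assumption): only the largest position of the lowest set bit of any hashed item is retained, yet it approximates the number of distinct items seen.

```python
import hashlib

def lowest_set_bit(x):
    """Index of the lowest set bit of x (0 maps to 64 as a sentinel)."""
    return (x & -x).bit_length() - 1 if x else 64

class FMCounter:
    """Approximate distinct-count with a few bits of state: the maximum
    'rho' (lowest-set-bit position + 1) seen over all hashed items."""

    def __init__(self):
        self.r = 0

    def observe(self, item):
        h = int.from_bytes(
            hashlib.blake2b(str(item).encode(), digest_size=8).digest(), 'big')
        self.r = max(self.r, lowest_set_bit(h) + 1)

    def estimate(self):
        # n distinct items give E[max rho] near log2(n), so 2**r scaled by
        # the Flajolet-Martin correction factor 0.77351 approximates n.
        return (2 ** self.r) / 0.77351
```

The state is a single small integer, which is the property the paper exploits: a couple of bits per L1D block suffice because only the running maximum must be stored.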

A neural network memory prefetcher using semantic locality [article]

Leeor Peled, Uri Weiser, Yoav Etsion
2018 arXiv   pre-print
Accurate memory prefetching is paramount for processor performance, and modern processors employ various techniques to identify and prefetch different memory access patterns.  ...  We believe that this line of research can further improve the efficiency of such neural networks and allow harnessing them for additional micro-architectural predictions.  ...  Prediction feedback The neural network must be able to accept feedback for training, both positive and negative.  ... 
arXiv:1804.00478v2 fatcat:rl2ahxusrbdntby4mx7r6yxoie

Efficient emulation of hardware prefetchers via event-driven helper threading

Ilya Ganusov, Martin Burtscher
2006 Proceedings of the 15th international conference on Parallel architectures and compilation techniques - PACT '06  
Furthermore, we demonstrate that running event-driven prefetching threads on top of a baseline with a hardware stride prefetcher yields significant speedups for many programs.  ...  prefetching techniques.  ...  Unlike previously proposed approaches for software prefetching, our EDHT mechanism can improve performance without the need to modify or analyze the original binary.  ... 
doi:10.1145/1152154.1152178 dblp:conf/IEEEpact/GanusovB06 fatcat:xbd5prrckjf3vlymoeipmmdcv4

Tolerating latency through software-controlled prefetching in shared-memory multiprocessors

Todd Mowry, Anoop Gupta
1991 Journal of Parallel and Distributed Computing  
Software-controlled prefetching is a technique for tolerating memory latency by explicitly executing prefetch instructions to move data close to the processor before it is actually needed.  ...  It also works for both uniprocessor and large-scale shared-memory multiprocessor architectures.  ...  The two types of feedback that may be useful for prefetching are control-flow feedback and memory behavior feedback, which we will briefly discuss.  ... 
doi:10.1016/0743-7315(91)90014-z fatcat:shfq6duw5jgpbfgyx5xfjobzo4
Showing results 1 — 15 out of 360 results