False sharing and spatial locality in multiprocessor caches

J. Torrellas, H.S. Lam, J.L. Hennessy
1994 IEEE transactions on computers  
To mitigate false sharing and to enhance spatial locality, we optimize the layout of shared data in cache blocks in a programmer-transparent manner.  ...  While the analysis of six applications in this paper confirms that false sharing has a significant impact on the miss rate, the measurements also show that poor spatial locality among accesses to shared  ... 
doi:10.1109/12.286299 fatcat:6ht3xsqbxncgxfblkh3r7q257i

Restructuring parallel loops to curb false sharing on multicore architectures

Santosh Sarangkar, Apan Qasem
2010 2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW)  
Preliminary evaluation on a dual-core and a quad-core platform shows that our strategy can be effective in reducing cache interference for multi-threaded applications that exhibit inter-core spatial locality  ...  This paper addresses the issue of cache interference that occurs when concurrent threads access data that reside on a shared cache block.  ...  HARMFUL SPATIAL LOCALITY The effects of harmful spatial locality has been studied extensively in the context of shared-memory multiprocessor systems.  ... 
doi:10.1109/ipdpsw.2010.5470764 dblp:conf/ipps/SarangkarQ10 fatcat:6t5w2qiadncx7lsu6aa4xhgd2a

Cache memory behavior of advanced PDE solvers [chapter]

D. Wallin, H. Johansson, S. Holmgren
2004 Advances in Parallel Computing  
These programs take advantage of spatial locality and the amount of false sharing is limited.  ...  Unfortunately, such prefetchers often lead to additional address snoops in multiprocessor caches.  ...  Acknowledgement We would like to thank Jim Nilsson for providing us with the original version of the cache coherence protocol model for Simics.  ... 
doi:10.1016/s0927-5452(04)80061-3 fatcat:me22hdu4xbc5pojhrg5zs7dcxa

Reducing false sharing on shared memory multiprocessors through compile time data transformations

Tor E. Jeremiassen, Susan J. Eggers
1995 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming - PPOPP '95  
We show through simulation that our analysis successfully identifies the data structures that are responsible for most false sharing misses, and then transforms them without unduly decreasing spatial locality  ...  The reduction in false sharing positively impacts both execution time and program scalability when executed on a KSR2.  ...  Singh and Josep Torrellas for providing us with many of the programs in our workload, Craig Chambers, Kathryn McKinley, Anne Rogers, Jean-Loup Baer and Dean Tullsen for helpful comments on an earlier draft  ... 
doi:10.1145/209936.209955 dblp:conf/ppopp/JeremiassenE95 fatcat:yhe746shrnfonmr5yrxiud5sgu

Reducing false sharing on shared memory multiprocessors through compile time data transformations

Tor E. Jeremiassen, Susan J. Eggers
1995 SIGPLAN notices  
We show through simulation that our analysis successfully identifies the data structures that are responsible for most false sharing misses, and then transforms them without unduly decreasing spatial locality  ...  The reduction in false sharing positively impacts both execution time and program scalability when executed on a KSR2.  ...  Singh and Josep Torrellas for providing us with many of the programs in our workload, Craig Chambers, Kathryn McKinley, Anne Rogers, Jean-Loup Baer and Dean Tullsen for helpful comments on an earlier draft  ... 
doi:10.1145/209937.209955 fatcat:uf2xx3wdl5ejlh7oqdtgava7la

Sequential hardware prefetching in shared-memory multiprocessors

F. Dahlgren, M. Dubois, P. Stenstrom
1995 IEEE Transactions on Parallel and Distributed Systems  
Sequential prefetching is a simple hardware-controlled prefetching technique which relies on the automatic prefetch of consecutive blocks following the block that misses in the cache, thus exploiting spatial  ...  To offset the effect of read miss penalties on processor utilization in shared-memory multiprocessors, several software- and hardware-based data prefetching schemes have been proposed.  ...  This research has been supported in part by the Swedish National Board for Industrial and Technical Development under contract number 9001797 and by the National Science Foundation under Grant No.  ... 
doi:10.1109/71.395402 fatcat:ag2u4cppb5fuzgtazzf7qg43sq

Eliminating invalidation in coherent-cache parallel graph reduction [chapter]

Andrew J. Bennett, Paul H. J. Kelly
1994 Lecture Notes in Computer Science  
a shared-memory multiprocessor.  ...  Parallel functional programs based on the graph reduction execution model display considerable locality of reference, favouring the use of large cache lines in the implementation of the shared heap on  ...  Exploiting Spatial Locality In this section the advantage gained by exploiting spatial locality is assessed in detail.  ... 
doi:10.1007/3-540-58184-7_116 fatcat:uhbfkfuwmreuho4l7pzff5hmxq

Improving cache locality by a combination of loop and data transformations

M. Kandemir, J. Ramanujam, A. Choudhary
1999 IEEE transactions on computers  
This paper describes a compiler algorithm for optimizing cache locality in scientific codes on uniprocessor and multiprocessor machines.  ...  An important special case is one in which data layouts of some arrays are fixed and cannot be changed.  ...  ACKNOWLEDGMENTS This work is supported in part by U.S. National Science Foundation (NSF) Young Investigator Award CCR-9357840, NSF CCR-9509143. The work of J.  ... 
doi:10.1109/12.752657 fatcat:fubf75cmcjbvvlbxfcunue7qlq

Boosting the performance of shared memory multiprocessors

P. Stenstrom, M. Brorsson, F. Dahlgren, H. Grahn, M. Dubois
1997 Computer  
An emerging class of shared memory multiprocessors, nonuniform memory access machines with private caches and a cache coherence (CC) protocol, use a directory-based write-invalidate scheme.  ...  Proposed hardware optimizations to CC-NUMA machines, shared memory multiprocessors that use cache consistency protocols, can shorten the time processors lose because of cache misses and invalidations.  ...  Acknowledgments This research was supported in part by the Swedish National Board for Industrial and Technical Development (NUTEK) under Contract 9001797 and by the US National Science Foundation under  ... 
doi:10.1109/2.596630 fatcat:igee7gkc2vhk7oyrjstlwzvz3q

On the value locality of store instructions

Kevin M. Lepak, Mikko H. Lipasti
2000 Proceedings of the 27th annual international symposium on Computer architecture - ISCA '00  
false sharing based on these observations, and suggests new techniques for aligning cache coherence protocols and microarchitectural store handling techniques to exploit the value locality of stores.  ...  Value locality, a recently discovered program attribute that describes the likelihood of the recurrence of previously-seen program values, has been studied enthusiastically in the recent published literature  ...  O New Definitions of False Sharing We shift our focus to multiprocessor applications of store value locality by introducing new definitions of false sharing.  ... 
doi:10.1145/339647.339678 fatcat:6qnrtsb4ajbaxhmgnxvsyi3evu

On the value locality of store instructions

Kevin M. Lepak, Mikko H. Lipasti
2000 SIGARCH Computer Architecture News  
false sharing based on these observations, and suggests new techniques for aligning cache coherence protocols and microarchitectural store handling techniques to exploit the value locality of stores.  ...  Value locality, a recently discovered program attribute that describes the likelihood of the recurrence of previously-seen program values, has been studied enthusiastically in the recent published literature  ...  O New Definitions of False Sharing We shift our focus to multiprocessor applications of store value locality by introducing new definitions of false sharing.  ... 
doi:10.1145/342001.339678 fatcat:feikpzyipjbqvdpihjcjc3utq4

Silent stores and store value locality

K.M. Lepak, G.B. Bell, M.H. Lipasti
2001 IEEE transactions on computers  
false sharing based on these observations, and suggests new techniques for aligning cache coherence protocols and microarchitectural store handling techniques to exploit the value locality of stores.  ...  Abstract: Value locality, a recently discovered program attribute that describes the likelihood of the recurrence of previously seen program values, has been studied enthusiastically in the recent published  ...  NEW DEFINITIONS OF FALSE SHARING We shift our focus to multiprocessor applications of store value locality by introducing new definitions of false sharing.  ... 
doi:10.1109/12.966493 fatcat:myj7oi5govcjxcxtkl5vib5jai

Improving parallel shear-warp volume rendering on shared address space multiprocessors

Dongming Jiang, Jaswinder Pal Singh
1997 SIGPLAN notices  
The results demonstrate that real time volume rendering is promising on general purpose multiprocessors, and illustrate the utility of tool hierarchies in conjunction with algorithmic and application knowledge  ...  distributed memory machines to networks of computers connected by page-based shared virtual memory.  ...  We cannot determine whether the misses are due to inherent (or false) sharing of data or due to capacity, conflict or cold misses, or even whether it is spatial or temporal locality that is causing problems  ... 
doi:10.1145/263767.263798 fatcat:2wge6pxkr5fbvgqlbd3sey4r4e

Reducing false sharing and improving spatial locality in a unified compilation framework

M. Kandemir, A. Choudhary, J. Ramanujam, P. Banerjee
2003 IEEE Transactions on Parallel and Distributed Systems  
Index Terms: Data reuse, cache locality, false sharing, loop and memory layout transformations, shared-memory multiprocessors.  ...  Large coherence units are helpful in exploiting spatial locality, but worsen the effects of false sharing.  ...  The material presented in this paper is based on research supported in part by the US National Science Foundation grants CCR-9357840 and CCR-9509143, and the Air Force Materials Command under contract  ... 
doi:10.1109/tpds.2003.1195407 fatcat:os4br76rkbfmphyja2uimnbuoa

A compiler algorithm for optimizing locality in loop nests

M. Kandemir, J. Ramanujam, A. Choudhary
1997 Proceedings of the 11th international conference on Supercomputing - ICS '97  
This paper describes an algorithm to optimize cache locality in scientific codes on uniprocessor and multiprocessor machines.  ...  Compiler Optimizations for Cache Locality and Coherence. Technical Report 504,  ...  Impact on False Sharing In shared-memory multiprocessors when processors make references to different data items within the same cache line, false sharing occurs [12].  ... 
doi:10.1145/263580.263650 dblp:conf/ics/KandemirRC97 fatcat:6asaplnfdvajpldvhuiczgpywm