1,857 Hits in 8.7 sec

Design and performance of directory caches for scalable shared memory multiprocessors

M.M. Michael, A.K. Nanda
1999 Proceedings Fifth International Symposium on High-Performance Computer Architecture  
Recent research shows that the occupancy of the coherence controllers is a major performance bottleneck for distributed cache coherent shared memory multiprocessors.  ...  The results also show the performance advantage of multientry directory cache lines, as a result of spatial locality and the absence of sharing of directories.  ...  shared memory multiprocessors.  ... 
doi:10.1109/hpca.1999.744354 dblp:conf/hpca/MichaelN99 fatcat:iv7byhxdzvaszlzrno77m24nxu

Scalable directory architecture for distributed shared memory chip multiprocessors

Huan Fang, Mats Brorsson
2009 SIGARCH Computer Architecture News  
Micro Architecture: How do we manage on-chip resources like caches, memory controllers and routers?  Interconnection: A more scalable on-chip network is needed rather than global broadcast technique  ...  for future CMPs.  Scalability: A system whose performance improves after adding hardware, proportionally to the capacity added, is said to be a scalable system.  ...  Ruby is a timing simulator of a multiprocessor memory system that models: caches, cache controllers, system interconnect, memory controllers, and banks of main memory.  ... 
doi:10.1145/1556444.1556452 fatcat:milb6jmo2rbzzfswjfkdfyqe2q

Scalable Shared-Memory Multiprocessing [Book Reviews]

J. Zalewski
1996 IEEE Parallel & Distributed Technology Systems & Applications  
The major finding was that reducing memory latency due to shared data is crucial for achieving good performance of parallel applications.  ...  Those systems are grouped into six categories: directory-based, hierarchical, reflective memory, non-cache-coherent, vector, and virtual shared-memory systems.  ... 
doi:10.1109/m-pdt.1996.494608 fatcat:bra56vkmozautepaatsjdl3xny

Cache-coherent distributed shared memory: perspectives on its development and future challenges

J. Hennessy, M. Heinrich, A. Gupta
1999 Proceedings of the IEEE  
We review the key developments that led to the creation of cache-coherent distributed shared memory and describe the Stanford DASH Multiprocessor, the first working implementation of hardware-supported  ...  Hardware-supported distributed shared memory is becoming the dominant approach for building multiprocessors with moderate to large numbers of processors.  ...  What are the best alternatives for designing scalable, cache-coherent multiprocessors?  ... 
doi:10.1109/5.747863 fatcat:koqfmkqdibaylcxfiheb33bwly

Scalable shared-memory multiprocessor architectures

S. Thakkar, M. Dubois, A.T. Laundrie, G.S. Sohi
1990 Computer  
Acknowledgment We would like to thank all the authors of the special reports that follow for their assistance and for their review of this introduction.  ...  Goosen and David R. Cheriton. "Predicting the Performance of Shared Multiprocessor Caches," Proc. Cache and Interconnect Workshop, M. Dubois and S.  ...  Because of the efficiency and ease of the shared-memory programming model, these machines are more popular for parallel programming than distributed multiprocessors such as NCube or Intel's iPSC.  ... 
doi:10.1109/2.55502 fatcat:bi7hzoeqarbsvjrj2ldtdig5ry

A Novel Lightweight Directory Architecture for Scalable Shared-Memory Multiprocessors [chapter]

Alberto Ros, Manuel E. Acacio, José M. García
2005 Lecture Notes in Computer Science  
There are two important hurdles that restrict the scalability of directory-based shared-memory multiprocessors: the directory memory overhead and the long L2 miss latencies due to the indirection introduced  ...  The lightweight directory architecture removes the directory structure from main memory and it stores directory information in the L2 cache avoiding in most cases the access to main memory.  ...  Acknowledgments This work has been supported by the Spanish Ministry of Ciencia y Tecnología and the European Union (Feder Funds) under grant TIC2003-08154-C06-03.  ... 
doi:10.1007/11549468_65 fatcat:t67l6qxu4zealpmrie7lxrapfm

A two-level directory architecture for highly scalable cc-NUMA multiprocessors

M.E. Acacio, J. Gonzalez, J.M. Garcia, J. Duato
2005 IEEE Transactions on Parallel and Distributed Systems  
One important issue the designer of a scalable shared-memory multiprocessor must deal with is the amount of extra memory required to store the directory information.  ...  This work presents a scalable directory architecture that significantly reduces the size of the directory for large-scale configurations of a multiprocessor without degrading performance.  ...  ACKNOWLEDGMENTS The authors would like to thank the anonymous referees for their detailed comments and valuable suggestions, which have helped to improve the quality of the paper.  ... 
doi:10.1109/tpds.2005.4 fatcat:3hcdkjqiwjh55in3d4uph6gdfq

Shared Memory Multiprocessors [chapter]

2004 Parallel Computing on Heterogeneous Networks  
The goal of this report is to give an overview of issues and tradeoffs involved in memory hierarchy design for shared memory multiprocessors.  ...  Therefore, the directory size grows linearly with the memory size and the number of nodes in the system and can become very large for large-scale multiprocessors.  ... 
doi:10.1002/0471654167.ch3 fatcat:dvaj7kmetfgr7bkmdrmvzljwda

A novel approach to reduce L2 miss latency in shared-memory multiprocessors

M. E. Acacio, J. Gonzalez, J. M. Garcia, J. Duato
2002 Proceedings 16th International Parallel and Distributed Processing Symposium  
Our proposal replaces the traditional directory with a novel three-level directory architecture and adds a small shared data cache to each of the nodes of a multiprocessor system.  ...  Due to their small size, the first-level directory and the shared data cache are integrated into the processor chip in every node.  ...  Acknowledgments This research has been carried out using the resources of the Centre de Computació i Comunicacions de Catalunya (CESCA-CEPBA) as well as the SGI Origin 2000 of the Universitat de Valencia  ... 
doi:10.1109/ipdps.2002.1015554 dblp:conf/ipps/AcacioGGD02 fatcat:s4zmex6ezvg5vnghe4py3p5btu

An efficient cache design for scalable glueless shared-memory multiprocessors

Alberto Ros, Manuel E. Acacio, José M. García
2006 Proceedings of the 3rd conference on Computing frontiers - CF '06  
In this work, we propose a novel design for the L2 cache level, at which coherence has to be maintained, aimed at being used in glueless shared-memory multiprocessors.  ...  Traditionally, cache coherence in large-scale shared-memory multiprocessors has been ensured by means of a distributed directory structure stored in main memory.  ...  scalable shared-memory multiprocessors.  ... 
doi:10.1145/1128022.1128065 dblp:conf/cf/RosAG06 fatcat:2fvv7rqjljfindrcopxyj2ntpy

Directory-based cache coherence in large-scale multiprocessors

D. Chaiken, C. Fields, K. Kurihara, A. Agarwal
1990 Computer  
We would also like to thank the rest of the Alewife group for putting up with our interminable trace-driven simulations. The research reported in this article is funded by DARPA contract No.  ...  Kirk Johnson, who wrote and traced the Speech application, is responsible for the read-only data processing results.  ...  For each memory reference in a trace, our cache and directory simulator determines the effects on the state of the corresponding block in the cache and the shared memory.  ... 
doi:10.1109/2.55500 fatcat:b3ybdsfjbjabrkc66pjvpkjuse

Towards hierarchical cluster based cache coherence for large-scale network-on-chip

Yuang Zhang, Zhonghai Lu, Axel Jantsch, Li Li, Minglun Gao
2009 2009 4th International Conference on Design & Technology of Integrated Systems in Nanoscal Era  
We introduce a novel hierarchical cluster based cache coherence scheme for large-scale NoC based distributed memory architectures. We describe the hierarchical memory organization.  ...  We show analytically that the proposed scheme has better performance than traditional counterparts both in memory overhead and communication cost.  ...  Section 4 discusses the performance and scalability of the proposed cache coherence scheme.  ... 
doi:10.1109/dtis.2009.4938037 fatcat:urthsdwznnhh3c4rhg7tnqku3a

Two proposals for the inclusion of directory information in the last-level private caches of glueless shared-memory multiprocessors

Alberto Ros, Ricardo Fernández-Pascual, Manuel E. Acacio, José M. García
2008 Journal of Parallel and Distributed Computing  
In this work, we propose two alternative designs for the last-level private cache of glueless shared-memory multiprocessors: the lightweight directory and the SGluM cache.  ...  In glueless shared-memory multiprocessors where cache coherence is usually maintained using a directory-based protocol, the fast access to the on-chip components (caches and network router, among others  ...  Acknowledgments The authors would like to thank the anonymous referees for their detailed comments and valuable suggestions, which have helped to improve the quality of the paper.  ... 
doi:10.1016/j.jpdc.2008.07.001 fatcat:2ridrw3lsrajjn2hy2vuprlcge

Proximity-aware directory-based coherence for multi-core processor architectures

Jeffery A. Brown, Rakesh Kumar, Dean Tullsen
2007 Proceedings of the nineteenth annual ACM symposium on Parallel algorithms and architectures - SPAA '07  
In this paper, we discuss implementations of coherence for CMPs and propose and evaluate a novel directory-based coherence scheme to improve the performance of parallel programs on such processors.  ...  As the number of cores increases on chip multiprocessors, coherence is fast becoming a central issue for multi-core performance.  ...  Acknowledgments The authors would like to thank the anonymous reviewers for their helpful insights.  ... 
doi:10.1145/1248377.1248398 dblp:conf/spaa/BrownKT07 fatcat:y7c3zgv3dncirjggikdfmyfuwi

An architecture for high-performance scalable shared-memory multiprocessors exploiting on-chip integration

M.E. Acacio, J. Gonzalez, J.M. Garcia, J. Duato
2004 IEEE Transactions on Parallel and Distributed Systems  
Index Terms: cc-NUMA multiprocessor, directory memory overhead, L2 miss latency, three-level directory, shared data cache, on-processor-chip integration.  ...  Due to their small size, the first-level directory and the shared data cache are integrated into the processor chip in every node, which enhances performance by saving accesses to the slower main memory  ...  ACKNOWLEDGMENTS The authors would like to thank the anonymous referees for their detailed comments and valuable suggestions, which have helped to improve the quality of the paper.  ... 
doi:10.1109/tpds.2004.27 fatcat:viz57lpgdncf5hekgof3cu62dq