Multigrain shared memory
ACM Transactions on Computer Systems
Parallel workstations, each comprising tens of processors based on shared memory, promise cost-effective scalable multiprocessing. This paper explores the coupling of such small- to medium-scale shared memory multiprocessors through software over a local area network to synthesize larger shared memory systems. We call these systems Distributed Shared-memory MultiProcessors (DSMPs). This paper introduces the design of a shared memory system that uses multiple granularities of sharing, called MGS, and presents a prototype implementation of MGS on the MIT Alewife multiprocessor. Multigrain shared memory enables the collaboration of hardware and software shared memory, thus synthesizing a single transparent shared memory address space across a cluster of multiprocessors. The system leverages the efficient support for fine-grain cache-line sharing within multiprocessor nodes as often as possible, and resorts to coarse-grain page-level sharing across nodes only when absolutely necessary. Using our prototype implementation of MGS, an in-depth study of several shared memory applications is conducted to understand the behavior of DSMPs. Our study is the first to comprehensively explore the DSMP design space, and to compare the performance of DSMPs against all-software and all-hardware DSMs on a single experimental platform. Keeping the total number of processors fixed, we show that applications execute up to 85% faster on a DSMP as compared to an all-software DSM. We also show that all-hardware DSMs hold a significant performance advantage over DSMPs on challenging applications, between 159% and 1014%. However, program transformations to improve data locality for these applications allow DSMPs to almost match the performance of an all-hardware multiprocessor of the same size.
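The two sharing granularities described above can be illustrated with a toy model. The following is only a sketch under stated assumptions, not the MGS protocol: the class `Cluster`, its method names, and the constant `PAGE_SIZE` are hypothetical, and the model omits invalidation, coherence, and page release entirely. It shows only the basic decision: an access to a page already mapped on the accessing processor's node is served by hardware at cache-line grain, while the first access from a node requires a software page-level transfer across nodes.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # bytes; an assumed value for illustration only


@dataclass
class Cluster:
    """Toy model of a DSMP: processors grouped into hardware-coherent nodes."""
    procs_per_node: int
    # node id -> set of page numbers currently mapped on that node (hypothetical state)
    mapped: dict = field(default_factory=dict)

    def node_of(self, proc: int) -> int:
        # Processors are numbered consecutively within each node.
        return proc // self.procs_per_node

    def access(self, proc: int, addr: int) -> str:
        """Return which mechanism would serve this access in the toy model."""
        node = self.node_of(proc)
        page = addr // PAGE_SIZE
        pages = self.mapped.setdefault(node, set())
        if page in pages:
            # Page already resident on this node: fine-grain hardware sharing.
            return "hardware"
        # First touch from this node: coarse-grain software page transfer.
        pages.add(page)
        return "software"


cluster = Cluster(procs_per_node=4)
print(cluster.access(0, 0))    # software: node 0 has not mapped page 0 yet
print(cluster.access(1, 64))   # hardware: same node, same page
print(cluster.access(4, 64))   # software: processor 4 is on node 1
```

In this model, larger nodes turn more accesses into hardware-served ones, which is the intuition behind preferring cache-line sharing within a node and falling back to page-level sharing only across nodes.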