A study of three dynamic approaches to handle widely shared data in shared-memory multiprocessors

Stefanos Kaxiras, Stein Gjessing, James R. Goodman
1998 Proceedings of the 12th international conference on Supercomputing - ICS '98  
In this paper we argue that widely shared data are a more serious problem than previously recognized, and that, furthermore, it is possible to provide transparent support that actually gives an advantage to accesses to widely shared data by exploiting their redundancy to improve accessibility. The previously proposed GLOW extensions to cache coherence protocols provide such support for widely shared data by defining functionality in the network domain. However, in their static form the GLOW extensions relied on the user to identify and expose widely shared data to the hardware. This approach suffers because: (i) it requires modification of the programs, (ii) it is not always possible to statically identify the widely shared data, and (iii) it is incompatible with commodity hardware. To address these issues, we study three dynamic schemes to discover widely shared data at runtime. The first scheme is inspired by read-combining and is based on observing requests in the network switches, the GLOW agents. The agents intercept requests whose addresses have been observed recently. This scheme closely tracks the performance of the static GLOW, and it always outperforms ordinary congestion-based read-combining. In the second scheme, the memory directory discovers widely shared data by counting the number of reads between writes. Information about the widely shared nature of the data is distributed to the nodes, which subsequently use special wide-sharing requests to access them. Simulations confirm that this scheme works well when the widely shared nature of the data is persistent over time. The third and most significant scheme is based on predicting which load instructions are going to access widely shared data. Although the implementation of this scheme is not as straightforward in a commodity-parts environment, it outperforms all others.
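The second scheme, in which the directory counts reads between writes, can be illustrated with a minimal sketch. This is not the paper's implementation: the class names, the threshold value, and the choice to keep a block marked as widely shared after a write are all assumptions made here for illustration.

```python
from dataclasses import dataclass

# Hypothetical threshold: the abstract gives no concrete value.
WIDE_SHARING_THRESHOLD = 4


@dataclass
class DirectoryEntry:
    reads_since_write: int = 0   # reads observed since the last write
    widely_shared: bool = False  # hint distributed to the requesting nodes


class Directory:
    """Toy memory directory that flags blocks read many times between writes."""

    def __init__(self, threshold: int = WIDE_SHARING_THRESHOLD):
        self.threshold = threshold
        self.entries: dict[int, DirectoryEntry] = {}

    def _entry(self, addr: int) -> DirectoryEntry:
        return self.entries.setdefault(addr, DirectoryEntry())

    def read(self, addr: int) -> bool:
        """Record a read; return True if the block is considered widely shared."""
        entry = self._entry(addr)
        entry.reads_since_write += 1
        if entry.reads_since_write >= self.threshold:
            entry.widely_shared = True
        return entry.widely_shared

    def write(self, addr: int) -> None:
        """Record a write; the read counter restarts.

        Assumption: the widely-shared flag is kept across writes, so that
        later reads can use the hint if the sharing pattern persists.
        """
        self._entry(addr).reads_since_write = 0
```

In this sketch, a node that receives `True` from `read` would issue wide-sharing requests for that block from then on. Keeping the flag across writes pays off only when the sharing pattern persists, which matches the condition under which the abstract reports the scheme working well.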
doi:10.1145/277830.277943 dblp:conf/ics/KaxirasGG98 fatcat:dtow5jzvbvfyrewlzrxkjn3toq