A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2017; you can also visit <a rel="external noopener" href="http://iacoma.cs.uiuc.edu/iacoma-papers/encyclopedia_coma.pdf">the original URL</a>. The file type is <code>application/pdf</code>.
Cache-only memory architectures
<span title="">1999</span>
<i title="Institute of Electrical and Electronics Engineers (IEEE)">
<a target="_blank" rel="noopener" href="https://fatcat.wiki/container/dsrvu6bllzai7oj3hktnc5yf4q" style="color: black;">Computer</a>
</i>
Synonyms
COMA

Definition
A Cache-Only Memory Architecture (COMA) is a type of cache-coherent nonuniform memory access (CC-NUMA) architecture. Unlike in a conventional CC-NUMA architecture, in a COMA every shared-memory module in the machine is a cache, where each memory line has a tag with the line's address and state. As a processor references a line, it transparently brings it to both its private cache(s) and its nearby portion of the NUMA shared memory (Local
<span class="external-identifiers">
<a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/2.769448">doi:10.1109/2.769448</a>
<a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/ozadfmvoyne5jczmnm42x4ukza">fatcat:ozadfmvoyne5jczmnm42x4ukza</a>
</span>
Memory), possibly displacing a valid line from its local memory. Effectively, each shared-memory module acts as a huge cache memory, giving the name COMA to the architecture. Since the COMA hardware automatically replicates the data and migrates it to the memory module of the node that is currently accessing it, COMA increases the chances of data being available locally. This reduces the possibility of frequent long-latency memory accesses. Effectively, COMA dynamically adapts the shared data layout to the application's reference patterns.

Discussion

Basic Concepts

In a conventional CC-NUMA architecture, each node contains one or more processors with private caches and a memory module that is part of the NUMA shared memory. A page allocated in the memory module of one node can be accessed by the processors of all other nodes. The physical page number of the page specifies the node where the page is allocated. Such a node is referred to as the Home Node of the page. The physical address of a memory line includes the physical page number and the offset within that page.

In large machines, fetching a line from a remote memory module can take several times longer than fetching it from the local memory module. Consequently, for an application to attain high performance, the local memory module must satisfy a large fraction of the cache misses. This requires a good placement of the program pages across the different nodes. If the program's memory access patterns are too complicated for the software to understand, individual data structures may not end up being placed in the memory module of the node that accesses them the most. In addition, when a page contains data structures that are read and written by different processors, it is hard to attain a good page placement.

In a COMA, the hardware can transparently eliminate a certain class of remote memory accesses. COMA does this by turning memory modules into large caches called Attraction Memory (AM).
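The home-node mapping described for conventional CC-NUMA can be illustrated with a small sketch. The text only says that the physical page number specifies the home node; the page size, the number of nodes, and the choice of which page-number bits encode the node are assumptions made here for illustration, as are the function names.

```python
# Assumed parameters (not from the article): 4 KiB pages, 16 nodes.
PAGE_BITS = 12   # bits of offset within a page
NODE_BITS = 4    # bits of the page number assumed to encode the home node


def split_address(paddr):
    """Split a physical address into (physical page number, page offset)."""
    page = paddr >> PAGE_BITS
    offset = paddr & ((1 << PAGE_BITS) - 1)
    return page, offset


def home_node(paddr):
    """Return the home node, assuming the low NODE_BITS of the page number encode it."""
    page, _ = split_address(paddr)
    return page & ((1 << NODE_BITS) - 1)


def is_remote(paddr, requesting_node):
    """True if a miss to paddr from requesting_node must be served by another node."""
    return home_node(paddr) != requesting_node
```

Under these assumptions, `is_remote` marks exactly the accesses whose cost depends on page placement in a conventional CC-NUMA machine, which is the class of accesses an AM tries to turn into local ones.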
When a processor requests a line from a remote memory, the line is inserted in both the processor's cache and the node's AM. A line can be evicted from an AM if another line needs the space. Ideally, with this support, the processor dynamically attracts its working set into its local memory module. The lines the processor is not accessing overflow and are sent to other memories. Because a large AM is more capable of containing a node's current working set than a cache is, more of the cache misses are satisfied locally within the node.

There are three issues that need to be addressed in COMA, namely finding a line, replacing a line, and dealing with the memory overhead. In the rest of this article, these issues are described first, then different COMA designs are outlined, and finally further readings are suggested.

David Padua (ed.), Encyclopedia of Parallel Computing, DOI ./----,
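The attraction behavior described above can be sketched as a toy simulation. This is a minimal model under stated assumptions, not a description of any real COMA design: the `TaggedStore` and `Node` classes are hypothetical, the stores are fully associative with LRU replacement, a single `"shared"` state stands in for the coherence states, and a displaced line is simply returned rather than relocated to another node's memory.

```python
from collections import OrderedDict


class TaggedStore:
    """Toy fully associative store of tagged lines with LRU replacement (assumed policy)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # tag (line address) -> state

    def lookup(self, addr):
        """Return True on a hit and mark the line as most recently used."""
        if addr in self.lines:
            self.lines.move_to_end(addr)
            return True
        return False

    def insert(self, addr, state="shared"):
        """Insert a line; return the displaced (addr, state) pair, or None."""
        victim = None
        if addr not in self.lines and len(self.lines) >= self.capacity:
            victim = self.lines.popitem(last=False)  # evict least recently used
        self.lines[addr] = state
        self.lines.move_to_end(addr)
        return victim


class Node:
    """One node: a small private cache backed by a larger attraction memory."""

    def __init__(self, cache_lines, am_lines):
        self.cache = TaggedStore(cache_lines)
        self.am = TaggedStore(am_lines)
        self.local_hits = 0
        self.remote_fetches = 0

    def access(self, addr):
        if self.cache.lookup(addr):
            return "cache"
        if self.am.lookup(addr):       # cache miss satisfied locally by the AM
            self.local_hits += 1
            self.cache.insert(addr)
            return "am"
        self.remote_fetches += 1       # fetched remotely, then placed in AM and cache
        self.am.insert(addr)
        self.cache.insert(addr)
        return "remote"


node = Node(cache_lines=2, am_lines=4)
trace = [0, 1, 2, 3, 0, 1, 2, 3]
results = [node.access(a) for a in trace]
```

In this trace the working set of four lines does not fit in the two-line cache but does fit in the four-line AM, so the second pass misses in the cache yet is served from the local AM instead of a remote node.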
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20170922224653/http://iacoma.cs.uiuc.edu/iacoma-papers/encyclopedia_coma.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext">
<button class="ui simple right pointing dropdown compact black labeled icon button serp-button">
<i class="icon ia-icon"></i>
Web Archive
[PDF]
</button>
</a>
<a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/2.769448">
<button class="ui left aligned compact blue labeled icon button serp-button">
<i class="external alternate icon"></i>
ieee.com
</button>
</a>