A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2001; you can also visit <a rel="external noopener" href="http://www.lcs.mit.edu:80/publications/pubs/pdf/MIT-LCS-TM-572.pdf">the original URL</a>. The file type is <code>application/pdf</code>.
<i title="Association for Computing Machinery (ACM)">
<a target="_blank" rel="noopener" href="https://fatcat.wiki/container/xu5bk2lj5rbdxlx6222nw7tsxi" style="color: black;">SIGPLAN notices</a>
Advances in VLSI technology will enable chips with over a billion transistors within the next decade. Unfortunately, the centralized-resource architectures of modern microprocessors are ill-suited to exploit such advances. Achieving a high level of parallelism at a reasonable clock speed requires distributing the processor resources, a trend already visible in the dual-register-file architecture of the Alpha 21264. A Raw microprocessor takes an extreme position in this space by distributing all its resources such as register files, memory ports, and ALUs over a pipelined two-dimensional interconnect, and exposing them fully to the compiler. Compilation for instruction-level parallelism (ILP) on such distributed-resource machines requires both spatial instruction scheduling and traditional temporal instruction scheduling. The compiler must also orchestrate data memories to take advantage of the on-chip distributed-memory bandwidth. This paper describes the techniques used by the Raw compiler to handle these issues. Preliminary results from a SUIF-based compiler for sequential programs written in C and Fortran indicate that the Raw approach to exploiting ILP can achieve speedups scalable with the number of processors for applications with parallelism within a basic block. Though research is still ongoing, these results offer positive indications that Raw may provide competitive performance against existing superscalars for applications with small amounts of parallelism, while achieving significantly better performance for applications with a large amount of ILP or coarse-grain parallelism.<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1145/291006.291018">doi:10.1145/291006.291018</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/hna7zwpzkzfovkfg2tn7lugw4m">fatcat:hna7zwpzkzfovkfg2tn7lugw4m</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20011112163445/http://www.lcs.mit.edu:80/publications/pubs/pdf/MIT-LCS-TM-572.pdf" title="fulltext PDF download">Web Archive [PDF]</a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1145/291006.291018">acm.org</a>