Efficient dynamic scheduling through tag elimination

Dan Ernst, Todd Austin
2002 SIGARCH Computer Architecture News  
An increasingly large portion of scheduler latency is derived from the monolithic content addressable memory (CAM) arrays accessed during instruction wakeup. The performance of the scheduler can be improved by decreasing the number of tag comparisons necessary to schedule instructions. Using detailed simulation-based analyses, we find that most instructions enter the window with at least one of their input operands already available. By putting these instructions into specialized windows with
more » ... wer tag comparators, load capacitance on the scheduler critical path can be reduced, with only very small effects on program throughput. For instructions with multiple unavailable operands, we introduce a last-tag speculation mechanism that eliminates all remaining tag comparators except those for the last arriving input operand. By combining these two tag-reduction schemes, we are able to construct dynamic schedulers with approximately one quarter of the tag comparators found in conventional designs. Conservative circuit-level timing analyses indicate that the optimized designs are 20-45% faster and require 10-20% less power, depending on instruction window size.
doi:10.1145/545214.545221 fatcat:ofgyx745knamjgtmwrzuo2r2wi