Efficient Instruction Schedulers for SMT Processors

J.J. Sharkey, D.V. Ponomarev
The Twelfth International Symposium on High-Performance Computer Architecture, 2006.  
We propose dynamic scheduler designs to improve the scheduler scalability and reduce its complexity in SMT processors. Our first design is an adaptation of the recently proposed instruction packing to SMT. Instruction packing opportunistically packs two instructions (possibly from different threads), each with at most one non-ready source operand at the time of dispatch, into the same issue queue entry. Our second design, termed 2OP_BLOCK, takes these ideas one step further and completely avoids dispatching instructions with two non-ready source operands. This technique has several advantages. First, it reduces the scheduling complexity (and the associated delays), as the logic needed to support instructions with two non-ready source operands is eliminated. More surprisingly, 2OP_BLOCK simultaneously improves performance, as the same issue queue entry may be reallocated multiple times to instructions with at most one non-ready source (which usually spend fewer cycles in the queue), as opposed to hogging the entry with an instruction that enters the queue with two non-ready sources. For schedulers with the capacity to hold 64 instructions, the 2OP_BLOCK design outperforms the traditional queue by 11% on average, and at the same time results in a 10% reduction in the overall scheduling delay.
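The 2OP_BLOCK dispatch rule described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names Instr, non_ready_sources and dispatch_2op_block are hypothetical, physical register tags are modeled as strings, and the sketch assumes that only the instruction at the head of each thread is considered per cycle and that a held-back instruction simply waits while other threads continue, a detail the abstract does not specify.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record: owning thread and source register tags."""
    thread: int
    srcs: Tuple[str, ...]


def non_ready_sources(instr: Instr, ready: Set[str]) -> int:
    """Count source operands whose values are not yet available."""
    return sum(1 for tag in instr.srcs if tag not in ready)


def dispatch_2op_block(heads: Iterable[Instr], ready: Set[str],
                       free_entries: int) -> List[Instr]:
    """Select which per-thread head instructions enter the issue queue.

    An instruction is dispatched only if it has at most one non-ready
    source operand and a queue entry is free. Instructions with two
    non-ready sources are held back (assumption: they wait at dispatch
    and are reconsidered in a later cycle).
    """
    dispatched = []
    for instr in heads:
        if free_entries == 0:
            break
        if non_ready_sources(instr, ready) <= 1:
            dispatched.append(instr)
            free_entries -= 1
    return dispatched


# Thread 0 waits on one operand (p2); thread 1 waits on two (p3, p4).
ready_tags = {"p1"}
head_t0 = Instr(thread=0, srcs=("p1", "p2"))
head_t1 = Instr(thread=1, srcs=("p3", "p4"))
chosen = dispatch_2op_block([head_t0, head_t1], ready_tags, free_entries=2)
```

In this example only the thread 0 instruction is selected, so the second queue entry stays available for another instruction with at most one non-ready source, which is the reallocation effect the abstract credits for the performance gain.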
doi:10.1109/hpca.2006.1598137 dblp:conf/hpca/SharkeyP06 fatcat:pehckrui2bbabef7c3ms6diygi