Self-Optimizing Memory Controllers

Engin Ipek, Onur Mutlu, José F. Martínez, Rich Caruana
2008 SIGARCH Computer Architecture News  
Efficiently utilizing off-chip DRAM bandwidth is a critical issue in designing cost-effective, high-performance chip multiprocessors (CMPs). Conventional memory controllers deliver relatively low performance in part because they often employ fixed, rigid access scheduling policies designed for average-case application behavior. As a result, they cannot learn and optimize the long-term performance impact of their scheduling decisions, and cannot adapt their scheduling policies to dynamic workload behavior. We propose a new, self-optimizing memory controller design that operates using the principles of reinforcement learning (RL) to overcome these limitations. Our RL-based memory controller observes the system state and estimates the long-term performance impact of each action it can take. In this way, the controller learns to optimize its scheduling policy on the fly to maximize long-term performance. Our results show that an RL-based memory controller improves the performance of a set of parallel applications run on a 4-core CMP by 19% on average (up to 33%), and it improves DRAM bandwidth utilization by 22% compared to a state-of-the-art controller.

Key idea: We propose to design the memory controller as an RL agent whose goal is to automatically learn an optimal memory scheduling policy via interaction with the rest of the system. An RL-based memory controller takes as input parts of the system state and considers the long-term performance impact of each action it can take. The controller's job is to (1) associate system states and actions with long-term reward values, and (2) take the action (i.e., schedule the command) that is estimated to provide the highest long-term reward.
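The two-step loop described above — associating state-action pairs with long-term reward estimates and then scheduling the command with the highest estimate — is the core of tabular Q-learning. The sketch below is an illustrative simplification, not the paper's implementation: the command set, state encoding, and reward signal here are hypothetical stand-ins (the actual controller uses hardware-friendly state features such as transaction-queue occupancy and row-buffer status, and approximates the Q-table with CMAC-style function approximation).

```python
import random

# Hypothetical DRAM command set; the real controller's action space is
# constrained by DRAM timing and bank-state legality rules.
ACTIONS = ("precharge", "activate", "read", "write")

class RLMemoryScheduler:
    """Minimal tabular Q-learning sketch of an RL-based scheduler.

    Maintains Q-values -- estimates of long-term reward -- for
    (state, action) pairs, and schedules the action with the highest
    estimate, exploring occasionally (epsilon-greedy).
    """

    def __init__(self, alpha=0.1, gamma=0.95, epsilon=0.05, seed=0):
        self.q = {}            # (state, action) -> long-term reward estimate
        self.alpha = alpha     # learning rate
        self.gamma = gamma     # discount factor for future reward
        self.epsilon = epsilon # exploration probability
        self.rng = random.Random(seed)

    def choose(self, state):
        # Explore with small probability; otherwise exploit the action
        # with the highest estimated long-term reward.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q.get((state, a), 0.0))

    def update(self, state, action, reward, next_state):
        # One-step Q-learning: move the estimate toward the observed
        # reward plus the discounted value of the best next action.
        best_next = max(self.q.get((next_state, a), 0.0) for a in ACTIONS)
        old = self.q.get((state, action), 0.0)
        self.q[(state, action)] = old + self.alpha * (
            reward + self.gamma * best_next - old)
```

For example, if servicing a read that hits an open row repeatedly yields a higher reward than re-activating, the Q-value for (row-hit state, read) grows until the scheduler prefers that command in that state — the "on the fly" policy optimization the abstract refers to.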
doi:10.1145/1394608.1382172