Scalable queue-based spin locks with timeout
Queue-based spin locks allow programs with busy-wait synchronization to scale to very large multiprocessors, without fear of starvation or performance-destroying contention. So-called try locks, traditionally based on non-scalable test-and-set locks, allow a process to abandon its attempt to acquire a lock after a given amount of time. The process can then pursue an alternative code path, or yield the processor to some other process. We demonstrate that it is possible to obtain both scalability and bounded waiting, using variants of the queue-based locks of Craig, Landin, and Hagersten (CLH), and of Mellor-Crummey and Scott (MCS). A process that decides to stop waiting for one of these new locks can "link itself out of line" atomically. Single-processor experiments reveal performance penalties of 50-100% for the CLH and MCS try locks in comparison to their standard versions; this marginal cost decreases with larger numbers of processors. We have also compared our queue-based locks to a traditional test-and-test-and-set lock with exponential backoff and timeout. At modest (non-zero) levels of contention, the queued locks sacrifice cache locality for fairness, resulting in a worst-case 3X performance penalty. At high levels of contention, however, they display a 1.5-2X performance advantage, with significantly more regular timings and significantly higher rates of acquisition prior to timeout.
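
For readers unfamiliar with try locks, the following is a minimal C++ sketch of the traditional baseline mentioned above: a test-and-test-and-set lock with exponential backoff and timeout. It is not the queue-based CLH or MCS try lock that is the subject of this work. The class name `TatasTryLock`, the method names, and the backoff constants are hypothetical choices made for illustration only.

```cpp
#include <algorithm>
#include <atomic>
#include <chrono>

// Hypothetical illustration of a test-and-test-and-set try lock.
// Names and constants are assumptions, not taken from the paper.
class TatasTryLock {
    std::atomic<bool> held{false};

public:
    // Returns true if the lock was acquired before `patience` elapsed,
    // false if the caller timed out and should take an alternative path.
    bool try_acquire(std::chrono::nanoseconds patience) {
        using clock = std::chrono::steady_clock;
        const auto deadline = clock::now() + patience;
        auto backoff = std::chrono::nanoseconds(100);           // assumed initial delay
        const auto max_backoff = std::chrono::nanoseconds(100000);  // assumed cap

        for (;;) {
            // "Test": read the flag first, so the atomic exchange is only
            // attempted when the lock appears to be free.
            if (!held.load(std::memory_order_relaxed) &&
                !held.exchange(true, std::memory_order_acquire)) {
                return true;  // "test-and-set" succeeded
            }
            if (clock::now() >= deadline) {
                return false;  // timed out: abandon the attempt
            }
            // Busy-wait for the current backoff interval, but never past
            // the deadline, then double the interval up to the cap.
            const auto until = std::min(clock::now() + backoff, deadline);
            while (clock::now() < until) {
                // spin
            }
            backoff = std::min(backoff * 2, max_backoff);
        }
    }

    void release() { held.store(false, std::memory_order_release); }
};
```

A caller would typically write `if (lock.try_acquire(std::chrono::microseconds(50))) { /* critical section */ lock.release(); } else { /* alternative path */ }`. Note that timing out here is trivial because a waiting process holds no position in any queue; a queue-based try lock must additionally remove the departing process from the queue, which is the problem the locks described above address.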