A lightweight in-place implementation for software thread-level speculation

Cosmin E. Oancea, Alan Mycroft, Tim Harris
2009 Proceedings of the twenty-first annual symposium on Parallelism in algorithms and architectures - SPAA '09  
Thread-level speculation (TLS) is a technique that allows parts of a sequential program to be executed in parallel. TLS ensures the parallel program's behaviour remains true to the language's original sequential semantics; for example, allowing multiple iterations of a loop to run in parallel if there are no conflicts between them. Conventional software-TLS algorithms detect conflicts dynamically. They suffer from a number of problems. TLS implementations can impose large storage overheads
... d by buffering speculative work. TLS implementations can offer disappointing scalability, if threads can only commit speculative work back to the "real" heap sequentially. TLS implementations can be slow because speculative reads must consult look-aside tables to see earlier speculative writes, or because speculative operations replace normal reads and writes with expensive synchronisation primitives (e.g. CAS or memory fences). We present a streamlined software-TLS algorithm for mostly-parallel loops that aims to avoid these problems. We allow speculative work to be performed in place, so we avoid buffering, and so that reads naturally see earlier writes. We avoid needing a serial-commit protocol. We avoid the need for CAS or memory fences in common operations. We strive to reduce the size of TLS-related conflict-detection state, and to interact well with typical data-cache implementations. We evaluate our implementation on off-the-shelf hardware using seven applications from SciMark2, BYTEmark and JOlden. We achieve an average 77% of the speed-up of manually-parallelized versions of the benchmarks for fully parallel loops. We achieve a maximum of a 5.8x speed-up on an 8-core machine.
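To make the idea of in-place speculation with dynamic conflict detection concrete, here is a minimal single-threaded toy in Python. It is not the authors' algorithm: the names `run_speculatively`, `Conflict`, `double` and `prefix_sum`, the per-location reader/writer maps, the undo log and the fall-back to sequential re-execution are all assumptions made for illustration. Parallelism is only simulated by running iterations in an arbitrary schedule, so the sketch shows none of the fence-free or cache-friendly properties the abstract describes.

```python
from typing import Callable, Dict, List, Sequence, Set, Tuple


class Conflict(Exception):
    """Two different iterations touched one location and at least one wrote."""


def run_speculatively(data: List[int], body: Callable, schedule: Sequence[int]) -> bool:
    """Run loop iterations in `schedule` order, writing directly into `data`.

    Returns True if speculation succeeded. On a conflict, the writes are
    undone, the loop is re-run in sequential order, and False is returned.
    """
    writer: Dict[int, int] = {}          # location -> iteration that wrote it
    readers: Dict[int, Set[int]] = {}    # location -> iterations that read it
    undo: List[Tuple[int, int]] = []     # (location, old value), assumed rollback log

    def make_access(i: int):
        def read(loc: int) -> int:
            if writer.get(loc, i) != i:              # written by another iteration
                raise Conflict
            readers.setdefault(loc, set()).add(i)
            return data[loc]                         # in place: no look-aside table

        def write(loc: int, value: int) -> None:
            if writer.get(loc, i) != i or readers.get(loc, set()) - {i}:
                raise Conflict
            undo.append((loc, data[loc]))
            writer[loc] = i
            data[loc] = value                        # in place: no write buffer

        return read, write

    try:
        for i in schedule:
            body(i, *make_access(i))
        return True
    except Conflict:
        for loc, old in reversed(undo):              # roll back speculative writes
            data[loc] = old
        for i in sorted(schedule):                   # original sequential semantics
            body(i, data.__getitem__, data.__setitem__)
        return False


def double(i, read, write):
    """Fully parallel body: iteration i touches only location i."""
    write(i, read(i) * 2)


def prefix_sum(i, read, write):
    """Dependent body: iteration i reads what iteration i - 1 wrote."""
    if i > 0:
        write(i, read(i - 1) + read(i))
```

With `double`, any schedule succeeds because no two iterations share a location. With `prefix_sum`, an out-of-order schedule triggers a conflict, and the result still matches the sequential loop after rollback and re-execution.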
doi:10.1145/1583991.1584050 dblp:conf/spaa/OanceaMH09 fatcat:j337tnmw2neyxpd5tyzlxypc2m