Boosting single-thread performance in multi-core systems through fine-grain multi-threading

Carlos Madriles, Pedro López, Josep M. Codina, Enric Gibert, Fernando Latorre, Alejandro Martinez, Raúl Martinez, Antonio Gonzalez
2009 Proceedings of the 36th annual international symposium on Computer architecture - ISCA '09  
Industry has shifted towards multi-core designs as we have hit the memory and power walls. However, single thread performance remains of paramount importance since some applications have limited thread-level parallelism (TLP), and even a small part with limited TLP impose important constraints to the global performance, as explained by Amdahl's law. In this paper we propose a novel approach for leveraging multiple cores to improve single-thread performance in a multi-core design. The proposed
more » ... chnique features a set of novel hardware mechanisms that support the execution of threads generated at compile time. These threads result from a fine-grain speculative decomposition of the original application and they are executed under a modified multi-core system that includes: (1) mechanisms to support multiple versions; (2) mechanisms to detect violations among threads; (3) mechanisms to reconstruct the original sequential order; and (4) mechanisms to checkpoint the architectural state and recovery to handle misspeculations. The proposed scheme outperforms previous hardware-only schemes to implement the idea of combining cores for executing single-thread applications in a multi-core design by more than 10% on average on Spec2006 for all configurations. Moreover, single-thread performance is improved by 41% on average when the proposed scheme is used on a Tiny Core, and up to 2.6x for some selected applications.
doi:10.1145/1555754.1555813 dblp:conf/isca/MadrilesLCGLMMG09 fatcat:lsay4dj6gjbl7l6w2jyg566tcu