Software Barrier Performance on Dual Quad-Core Opterons

Jie Chen, William Watson III
2008 International Conference on Networking, Architecture, and Storage
SMP servers based on multi-core processors have become building blocks for Linux clusters in recent years because they can deliver better performance for multi-threaded programs through on-chip multi-threading. However, a relatively slow software barrier can hinder the performance of a data-parallel scientific application on a multi-core system. In this paper we study the performance of different software barrier algorithms on a server based on the newly introduced AMD quad-core Opteron processors. We study how the memory architecture and the cache coherence protocol of the system influence the performance of barrier algorithms. We present an optimized barrier algorithm derived from the queue-based barrier algorithm. We find that the optimized barrier algorithm achieves a speedup of 1.77 over the original queue-based algorithm. In addition, it achieves a speedup of 2.39 over the software barrier generated by the Intel OpenMP compiler.
doi:10.1109/nas.2008.27 dblp:conf/iwnas/ChenW08 fatcat:t22byftaj5eq3k7xoobrlf34ty