Parallel cache-efficient code for computing the McCaskill partition functions

Marek Pałkowski, Włodzimierz Bielecki
2019 Proceedings of the 2019 Federated Conference on Computer Science and Information Systems  
We present parallel, tiled, optimized code for computing McCaskill's partition functions, a CPU- and memory-intensive dynamic programming task in computational biology. To optimize the code, we use our own source-to-source compiler, TRACO, and compare the performance of the resulting code with that of code generated by the state-of-the-art PLuTo compiler, which is based on the affine transformations framework (ATF). Although PLuTo generates tiled code with outstanding locality, it fails to parallelize the tiled code. A ... O tiling strategy uses the transitive closure of a dependence graph to avoid affine function calculation. The ISL scheduler is used to parallelize tiled loop nests. An experimental study carried out on a multi-core computer demonstrates considerable speed-up of the generated code for larger numbers of threads.
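For orientation, the kind of triangular dynamic programming loop nest that such tiling and parallelization targets can be sketched as follows. This is a minimal serial Python sketch under a simplified base-pair energy model, not the full McCaskill recurrences and not the code generated by TRACO or PLuTo. The names `partition_function`, `PAIRS`, `pair_energy`, `kT`, and `min_loop` are hypothetical and chosen only for illustration.

```python
import math

# Assumption: canonical and wobble pairs, each contributing the same energy.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def partition_function(seq, pair_energy=-1.0, kT=1.0, min_loop=3):
    """Sum of Boltzmann weights over all non-crossing structures of seq.

    Simplified model (an assumption): every base pair contributes
    pair_energy, and at least min_loop unpaired bases separate paired bases.
    """
    n = len(seq)
    w = math.exp(-pair_energy / kT)  # Boltzmann weight of a single pair

    # Q[i][j] covers the half-open range seq[i:j]; empty ranges have weight 1.
    Q = [[1.0] * (n + 1) for _ in range(n + 1)]

    # Fill by increasing subsequence length: each cell depends on shorter
    # ranges in the same row and in rows below it.
    for length in range(1, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            last = j - 1
            total = Q[i][j - 1]  # case 1: the last base is unpaired
            # case 2: the last base pairs with position k
            for k in range(i, last - min_loop):
                if (seq[k], seq[last]) in PAIRS:
                    total += Q[i][k] * w * Q[k + 1][last]
            Q[i][j] = total
    return Q[0][n]
```

With `pair_energy=0.0` every structure has weight 1, so the function simply counts non-crossing structures, which makes small cases easy to check by hand. The innermost loop over `k` reads cells from the same row and the same column of `Q`, which is the dependence pattern that makes cache-efficient tiling of this computation non-trivial.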
doi:10.15439/2019f8 dblp:conf/fedcsis/PalkowskiB19 fatcat:ng45r6gvrrcdpgzbre6vqa46ti