A case study of high-throughput biological data processing on parallel platforms

D. Pekurovsky, I. N. Shindyalov, P. E. Bourne
2004 Bioinformatics  
Motivation: Analysis of large biological data sets using a variety of parallel processor computer architectures is a common task in bioinformatics. The efficiency of the analysis can be significantly improved by properly handling redundancy present in these data combined with taking advantage of the unique features of these compute architectures. Results: We describe a generalized approach to this analysis, but present specific results using the program CEPAR, an efficient implementation of the Combinatorial Extension algorithm in a massively parallel (PAR) mode for finding pairwise protein structure similarities and aligning protein structures from the Protein Data Bank. CEPAR design and implementation are described and results provided for the efficiency of the algorithm when run on a large number of processors.
doi:10.1093/bioinformatics/bth184 pmid:15044237 fatcat:mv4x6dkgdfgprpfcmulel2fxyi