GridR: An R-Based Grid-Enabled Tool for Data Analysis in ACGT Clinico-Genomics Trials
Third IEEE International Conference on e-Science and Grid Computing (e-Science 2007)
This paper presents GMLE 1 , a generic and distributed framework for maximum likelihood evaluation. GMLE is currently being applied to astroinformatics for determining the shape of star streams in the Milky Way galaxy, and to particle physics in a search for theory-predicted but yet unobserved sub-atomic particles. GMLE is designed to enable parallel and distributed executions on platforms ranging from supercomputers and high-performance homogeneous computing clusters to more heterogeneous Grid
... and Internet computing environments. GMLE's modular implementation seperates concerns of developers into the distributed evaluation frameworks, scientific models, and search methods, which interact through a simple API. This allows us to compare the benefits and drawbacks of different scientific models using different search methods on different computing environments. We describe and compare the performance of two implementations of the GMLE framework: an MPI version that more effectively uses homogeneous environments such as IBM's BlueGene, and a SALSA version that more easily accommodates heterogeneous environments such as the Rensselaer Grid. We have shown GMLE to scale well in terms of computation as well as communication over a wide range of environments. We expect that scientific computing frameworks, such as GMLE, will help bridge the gap between scientists needing to analyze ever larger amounts of data and ever more complex distributed computing environments.