Exascale Computing: Challenges and Opportunities for Applied Mathematicians and Engineers

Ravi Samtaney
2012 Journal of Applied & Computational Mathematics  
Computational science has emerged as the third leg of the scientific enterprise alongside theory and experimentation. Since the advent of the digital computer, followed by parallel computing, we have made tremendous strides in many scientific areas. Hardware advances alone were not responsible, however; significant advances in algorithms accompanied the increases in hardware. As Moore's law flattens out, we expect the next advances in hardware to be in the spirit of "many cores on a chip", i.e., massive parallelism. We are soon to be in the throes of another revolution in parallel computing, in which O(10^8) cores or more will become the norm, at first at a few select high-performance computing centers and then at supercomputing facilities around the world as these systems proliferate. Welcome to the exascale!

One motivation for seeking to reach exascale (exa = 10^18) computing, beyond the sentiment of Mallory's "because it's there" remark about the desire to climb Everest, is to overcome or mitigate the tyranny of scales so prevalent in multiphysics applications. Furthermore, one of the central tenets of exascale computing is to transform modeling and simulation in science and engineering into a genuinely science-based predictive discipline. The US Department of Energy is one government entity that has paid significant attention to this upcoming revolution, organizing several workshops over the past few years [1]. These workshops have identified several areas of science where exascale computing will push the frontiers in a transformative fashion, including climate science, materials, renewable energy, biology, and socioeconomic modeling. Research opportunities within the energy and environment science areas, for example, are discussed in reference [2].

Today it is not unusual to encounter decades-old numerical kernels in multiphysics simulation codes; many of these kernels were written first for serial computers, then ported to vector architectures, and finally parallelized in an ad hoc fashion for parallel machines. Even as the number of processors increased to O(10^3), these numerical kernels survived. This modus operandi cannot be extrapolated to the exascale, for a number of reasons. Going to the exascale will require radical changes in hardware, and effectively using tens of millions of cores will require a thorough understanding of the hardware architecture when programming applications. The new mantra is that while physical memory is cheap, accessing memory is extremely expensive. Codes will have to be rewritten to make the most effective use of cache, other architectural design features, and heterogeneity (for example, a mix of CPUs and GPUs). Particular attention will have to be paid to Amdahl's law: if a fraction s of the work remains serial, the speedup on P cores is bounded by 1/(s + (1 - s)/P), so even a serial fraction of one part in a million caps the speedup near 10^6 no matter how many cores are available. Another important consideration is fault tolerance, so that in the event of a hardware failure of a single core the code can recover gracefully. Yet another is "green algorithms", i.e., algorithms that pay attention not only to data movement and placement but also to reducing power consumption. A new theme that has emerged is co-design, wherein software and hardware are designed in a coupled fashion to implement a particular function or application; an example of co-design in the area of climate modeling is discussed in reference [3].

In all of these areas of exascale computing, applied mathematicians and computational scientists can find research opportunities. Let us consider a simple task. Suppose one would like to develop a Newton-Krylov type implicit algorithm to solve a system of partial differential equations exhibiting multiple time and spatial scales [4]. The core of a Newton-Krylov method consists of a nonlinear function evaluation, which is also used to compute the Jacobian-vector products during the Krylov iterative solution step. Usually some form of preconditioning also becomes necessary.
Now imagine implementing this algorithm and computing on tens of millions of cores. The nonlinear function evaluation has to pay special attention to the deep memory hierarchy, and the linear solver (which dominates the computing cost as the problem size grows) needs to be written in a fault-tolerant, communication-minimizing, synchronization-hiding fashion [5]. Beyond devising clever heuristics and ad hoc techniques to keep computational scientists busy, applied mathematicians will be engaged to prove theorems about the optimality, or lack thereof, and the asymptotic efficiency of proposed algorithms. There is clearly enough work here to keep a whole generation of applied mathematicians and computational scientists busy in the worthwhile endeavor of exascale computing.
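To make the Newton-Krylov workflow concrete, here is a minimal, purely serial Jacobian-free Newton-Krylov sketch in Python with NumPy/SciPy. It is not taken from reference [4] or from any production code; the 1-D reaction-diffusion model residual, the finite-difference increment, and the solver tolerances are all illustrative assumptions. It does show the three ingredients named above: a nonlinear residual evaluation, a Jacobian-vector product obtained by differencing that same residual (so the Jacobian is never formed), and an inexact Krylov (GMRES) inner solve, with the preconditioner slot left open.

```python
# A minimal Jacobian-free Newton-Krylov (JFNK) sketch, assuming Python with
# NumPy/SciPy. The 1-D reaction-diffusion residual, the finite-difference
# increment, and all tolerances below are illustrative choices, not taken
# from the references cited in this editorial.

import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres


def residual(u):
    """Nonlinear residual F(u) of -u'' + u^3 = 1 on the unit interval with
    homogeneous Dirichlet boundaries, discretized by second-order finite
    differences (a hypothetical model problem)."""
    n = u.size
    h = 1.0 / (n + 1)
    up = np.concatenate(([0.0], u, [0.0]))  # pad with boundary values
    return -(up[2:] - 2.0 * up[1:-1] + up[:-2]) / h**2 + u**3 - 1.0


def jfnk_solve(u0, tol=1e-8, max_newton=20):
    """Newton iteration in which each Jacobian-vector product is approximated
    by differencing the residual, so the Jacobian matrix is never formed."""
    u = u0.copy()
    for k in range(max_newton):
        F = residual(u)
        norm_F = np.linalg.norm(F)
        print(f"Newton iteration {k}: ||F|| = {norm_F:.3e}")
        if norm_F < tol:
            break

        def jac_vec(v):
            # Matrix-free product J(u) v ~ (F(u + eps v) - F(u)) / eps,
            # with eps scaled by ||u|| and ||v||.
            v = np.asarray(v).ravel()
            nv = np.linalg.norm(v)
            if nv == 0.0:
                return np.zeros_like(u)
            eps = 1e-7 * (1.0 + np.linalg.norm(u)) / nv
            return (residual(u + eps * v) - F) / eps

        J = LinearOperator((u.size, u.size), matvec=jac_vec)
        # Inexact inner solve; a preconditioner would be supplied via M=.
        # info == 0 signals that GMRES reached its tolerance.
        du, info = gmres(J, -F, atol=1e-12, restart=60, maxiter=100)
        u += du
    return u


if __name__ == "__main__":
    u = jfnk_solve(np.zeros(63))
```

The matrix-free structure is precisely what makes the algorithm attractive at scale: the only global operations are the residual evaluation and the inner products inside the Krylov solve, which is where the communication-minimizing and synchronization-hiding variants mentioned above would enter.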
doi:10.4172/2168-9679.1000e126