BATRUN DPS: MULTI-CELL, FAULT-TOLERANT, PARALLEL, BATCH PROCESSING FOR MONTE CARLO SIMULATIONS

FREDY TANDIARY, SURAJ C. KOTHARI, ASHISH DIXIT, E. WALTER ANDERSON
1996 Computing in High Energy Physics '95  
This paper discusses the design of the BATRUN Distributed Processing System (DPS). In contrast to a dedicated cluster of workstations, the scheduling in BATRUN DPS must ensure that only the idle cycles are used for distributed computing and the local users, when they are operating, have full control of their machines. BATRUN DPS has several unique features: group-based scheduling policy to ensure execution priority based on ownership of machines, and multi-cell distributed design to eliminate a single point of failure as well as to ensure scalability. The implementation of the system is based on multithreading and remote procedure call mechanisms.
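The group-based, idle-cycle scheduling idea described in the abstract can be illustrated with a minimal sketch. This is not the BATRUN DPS implementation: the names `Machine`, `Job`, `pick_job`, and `schedule`, the single shared queue, and the first-come fallback rule are all assumptions made for illustration. The sketch only shows two properties stated above: busy machines are never used, and jobs from the group that owns a machine take priority on it.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Machine:
    name: str
    owner_group: str  # hypothetical: the group that owns this workstation
    idle: bool        # True only when the local user is not using the machine


@dataclass
class Job:
    job_id: int
    group: str        # hypothetical: the group that submitted the job


def pick_job(machine: Machine, queue: List[Job]) -> Optional[Job]:
    """Choose a job for one machine, or None if it must be left alone."""
    if not machine.idle:
        # The local user keeps full control: never schedule on a busy machine.
        return None
    # Jobs from the machine's owning group get priority (assumed rule).
    for job in queue:
        if job.group == machine.owner_group:
            return job
    # Otherwise fall back to the oldest pending job (assumed rule).
    return queue[0] if queue else None


def schedule(machines: List[Machine], queue: List[Job]) -> Dict[str, int]:
    """Map machine names to job ids for one scheduling pass."""
    pending = list(queue)  # work on a copy so the caller's queue is untouched
    assignments: Dict[str, int] = {}
    for machine in machines:
        job = pick_job(machine, pending)
        if job is not None:
            assignments[machine.name] = job.job_id
            pending.remove(job)
    return assignments
```

For example, with machines owned by two groups and one of them in use by its local user, a single pass assigns work only to the idle machines and gives each its owner group's job first. A multi-cell design would presumably run one such scheduler per cell, but the abstract does not describe that mechanism, so it is not sketched here.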
doi:10.1142/9789814447188_0041 fatcat:xupvencvybemfchhb4iyllevoe