Experimental analysis of logical process simulation algorithms in JAMES II
Proceedings of the 2009 Winter Simulation Conference (WSC)
The notion of logical processes (LPs) is a widely used modeling paradigm in parallel and distributed discrete-event simulation (PDES). Yet, the comparison among different simulation algorithms for LP models remains difficult. Most simulation systems provide only a small subset of available algorithms, which are usually selected and tuned towards specific applications. Furthermore, many modeling and simulation frameworks blur the boundary between model logic and simulation algorithm, which hampers extensibility and comparability. Based on the general-purpose modeling and simulation framework JAMES II, which has already been used for algorithm experiments several times, we present an environment for the experimental analysis of simulation algorithms for logical processes. It separates model from simulator concepts, is extensible (with regard to benchmark models, algorithms, etc.), and facilitates a fair comparison of algorithms.

BACKGROUND AND MOTIVATION

Over the last decades, many solutions for the simulation of LP-based models have been proposed (Fujimoto 2000). They usually aim at overcoming the performance bottleneck of a sequential discrete-event simulation (Misra 1986), i.e., at achieving a speed-up in comparison to a sequential simulation. Two associated techniques strongly influence the performance of a parallel and distributed simulation: partitioning and synchronization. A partitioning policy has to decide how the LPs, i.e., the model entities, are mapped onto the available processors that shall execute them. This is a fundamental problem in parallel and distributed computing (Hendrickson and Kolda 2000). Of even more importance is the synchronization protocol. Since LPs are executed concurrently in a PDES, communication among them has to be synchronized. The existing protocols can be characterized as conservative, optimistic, or hybrid (Jha and Bagrodia 1994, Perumalla 2005), and they have a strong impact on PDES performance. More details on synchronization are given in Section 5. Many existing PDES solutions have been proposed for specific hardware architectures, e.g., (Jefferson et al. 1987, Fujimoto and Hybinette 1997, Chen and Szymanski 2005, Perumalla 2007).
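The partitioning step can be illustrated with a small sketch: a greedy policy that assigns each LP, heaviest first, to the currently least-loaded processor. This is a generic load-balancing heuristic under assumed per-LP load estimates; the function and parameter names are illustrative and not taken from any concrete PDES system discussed here.

```python
# Hypothetical sketch of a greedy partitioning policy: each logical
# process (LP) is assigned to the currently least-loaded processor.
import heapq

def partition_lps(lp_weights, num_processors):
    """Map LP indices to processors, balancing total weight greedily.

    lp_weights: per-LP load estimates (e.g., expected event counts)
    returns: dict lp_index -> processor id
    """
    # min-heap of (accumulated load, processor id)
    heap = [(0.0, p) for p in range(num_processors)]
    heapq.heapify(heap)
    assignment = {}
    # place the heaviest LPs first, so large items are spread out early
    for lp in sorted(range(len(lp_weights)), key=lambda i: -lp_weights[i]):
        load, proc = heapq.heappop(heap)
        assignment[lp] = proc
        heapq.heappush(heap, (load + lp_weights[lp], proc))
    return assignment
```

Real partitioners must additionally weigh inter-LP communication, which this sketch ignores; graph-partitioning approaches as surveyed by Hendrickson and Kolda (2000) address exactly that trade-off.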
Tuning algorithms towards specific hardware hampers the reproducibility of obtained performance results on other platforms, but is often inevitable for delivering good performance for the application at hand. Today's hardware architectures, with their long CPU pipelines and multiple cache layers, exhibit non-trivial runtime behavior, which makes the performance analysis of algorithms executed on them a non-trivial task (McGeoch 2007). Consequently, computational complexity theory is often insufficient for assessing an algorithm's real-world performance (LaMarca and Ladner 1997). Still, the theoretical analysis of algorithms may provide valuable insights regarding certain design decisions, although it usually requires strong assumptions and the results are rather general or inconclusive (e.g., Nicol 1998, Gupta, Akyildiz, and Fujimoto 1991). The importance of empirical algorithm performance analysis, and the methodological issues it entails, has been stressed by the "Experimental Algorithmics" community over the past years (McGeoch 2007, Johnson 2002). Following their guidelines for a fair comparison of algorithms, it is essential to compare all algorithms on the same benchmark models, and to only compare algorithms that are executed on the same platform. Moreover, a single implementation should not be trusted too much: it might turn out to be erroneous if thoroughly validated, or it could rely on some operation that acts as an avoidable performance bottleneck, which would render the achieved results invalid in general. This is particularly true for parallel simulation algorithms, as they are challenging to implement and inconsistencies can be very hard to find. To make things worse, most research papers compare a new algorithm to only one or two old ones, and the researchers might simply have spent more time on optimizing their own algorithm.
All these inaccuracies are hard to avoid and may shed doubt on the obtained results. Hence, it is neither sufficient to compare two algorithms only in theory, nor to compare their performance across different platforms, programming languages, and simulation systems; this is a central motivation for an experimental environment that ensures a fair comparison. The recent emergence of new application demands, techniques, and hardware platforms results in enhancements to traditional techniques and the formulation of new methods for parallel and distributed simulation (Perumalla 2006). Thus, research on PDES algorithms could benefit from a flexible and efficient way to evaluate and compare old and new PDES algorithms. JAMES II provides a fixed experimental setup and, by design, extensibility for new algorithms, and thus forms a solid base for the experimental analysis of simulation algorithms (Himmelspach and Uhrmacher 2007b). Anyone can add an implementation of a certain algorithm to the system and compare its performance to that of existing alternatives. Developers can thus concentrate on "their" algorithms, while the list of available, and hopefully tuned, implementations grows constantly over time. In addition, anyone can add new benchmark models: the performance of a certain algorithm may strongly depend on the concrete input (here: a model) it is applied to. The more and better benchmark models there are, the more reliable the experimental results become. Still, the most commonly used models should be provided from the start; only this makes it possible to compare newly obtained results from our framework to those already documented in the literature. Detailed descriptions of the experimental setup, as provided by the JAMES II experimentation layer (Himmelspach, Ewald, and Uhrmacher 2008), form the base for a sound experimental analysis.
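The plug-in idea described above can be sketched as follows. The registry, decorator, class name, and benchmark format are hypothetical and only illustrate the principle of registering interchangeable simulators against a common interface and timing them on the same benchmark input; this is not the actual JAMES II API.

```python
# Illustrative sketch of a plug-in registry for simulator implementations.
# All names here are assumptions for the example, not JAMES II identifiers.
import time

SIMULATORS = {}

def register(name):
    """Decorator that records a simulator class under a given name."""
    def deco(cls):
        SIMULATORS[name] = cls
        return cls
    return deco

@register("sequential")
class SequentialSimulator:
    def run(self, model_events):
        # a stand-in "simulation": process events in timestamp order
        return sorted(model_events)

def compare_all(model_events):
    """Run every registered simulator on the same benchmark model,
    returning per-simulator (wall-clock seconds, result) pairs."""
    results = {}
    for name, cls in SIMULATORS.items():
        start = time.perf_counter()
        out = cls().run(model_events)
        results[name] = (time.perf_counter() - start, out)
    return results
```

The point of the design is that a new algorithm only has to implement the common interface and register itself; the experiment code that iterates over all registered alternatives stays unchanged.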
These details make it easier to discuss and compare results, and may make them "future safe", i.e., a reliable source of information for future research. The latter aspect in particular requires the use of abstract, normalized performance measurements instead of just some general information about the platforms used (Johnson 2002). We believe that work on re-evaluation and standardized benchmarking may help to ensure the research quality of the PDES community, an important issue that has recently been recognized as crucial for the M&S community in general (Smith et al. 2008).
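One simple way to obtain such normalized values is to report each algorithm's runtime relative to a baseline algorithm measured on the same platform. The sketch below assumes a sequential simulator as that baseline; this is an illustrative choice for the example, not a prescription from the text.

```python
# Illustrative sketch: express each algorithm's runtime relative to a
# baseline measured on the same platform, so that numbers obtained on
# different machines remain roughly comparable. The baseline name is an
# assumption for this example.
def normalize_runtimes(runtimes, baseline="sequential"):
    """runtimes: dict mapping algorithm name -> wall-clock seconds,
    all measured on the same platform.
    Returns runtimes relative to the baseline (baseline itself = 1.0)."""
    ref = runtimes[baseline]
    return {alg: t / ref for alg, t in runtimes.items()}
```

A relative figure such as "0.25 of the sequential runtime" carries over to future hardware far better than an absolute "1.3 s on a 2009-era workstation".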