A comparative evaluation of procedural level generators in the Mario AI framework

Britton Horn, Steve Dahlskog, Noor Shaker, Gillian Smith, Julian Togelius
2014 International Conference on Foundations of Digital Games  
Evaluation is an open problem in procedural content generation research. The field is now in a state where there is a glut of content generators, each serving different purposes and using a variety of techniques. It is difficult to understand, quantitatively or qualitatively, what makes one generator different from another in terms of its output. To remedy this, we have conducted a large-scale comparative evaluation of level generators for the Mario AI Benchmark, a research-friendly clone of the classic platform game Super Mario Bros. In all, we compare the output of seven different level generators from the literature, based on different algorithmic methods, plus the levels from the original Super Mario Bros game. To compare them, we have defined six expressivity metrics, two of which are novel contributions of this paper. These metrics are shown to provide interestingly different characterizations of the level generators. The results presented in this paper, and the accompanying source code, are meant to become a benchmark against which to test new level generators and expressivity metrics.
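To illustrate what an expressivity metric can look like, the sketch below computes a simple linearity score for a tile-based level: the R² of a least-squares line fit to the level's height profile. This is a hypothetical reimplementation for illustration only, not the paper's exact code; the precise metric definitions, including the two novel pattern-based metrics, are given in the paper and its accompanying source.

```python
# Hypothetical sketch of one classic expressivity metric ("linearity"),
# not the paper's exact implementation. A level is assumed to be given
# as a list of columns, each holding the height of its topmost solid tile.

def linearity(heights: list[float]) -> float:
    """R^2 of a least-squares line fit to the height profile, in [0, 1]."""
    n = len(heights)
    if n < 2:
        return 1.0
    xs = range(n)
    mean_x = (n - 1) / 2
    mean_y = sum(heights) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, heights))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, heights))
    ss_tot = sum((y - mean_y) ** 2 for y in heights)
    if ss_tot == 0:
        return 1.0  # a perfectly flat level is perfectly linear
    return max(0.0, 1.0 - ss_res / ss_tot)

# Example: a gently rising level scores high; a jagged one scores low.
print(linearity([3, 3, 4, 4, 5, 5, 6, 6]))  # close to 1.0
print(linearity([3, 7, 2, 8, 1, 9, 2, 8]))  # close to 0.0
```

Scores like this, computed over many generated levels, can be plotted as a distribution to characterize a generator's expressive range and compare it against other generators.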
dblp:conf/fdg/HornDSST14