On the Effectiveness of Sampling for Evolutionary Optimization in Noisy Environments
Lecture Notes in Computer Science
Sampling has often been employed by evolutionary algorithms to cope with noise when solving noisy real-world optimization problems. It can improve the estimation accuracy by averaging over a number of samples, but it also increases the computation cost. Many studies have focused on designing efficient sampling methods, and conflicting empirical results have been reported. In this paper, we investigate the effectiveness of sampling in terms of rigorous running time, and find that sampling can be
... ective. We provide a general sufficient condition under which sampling is useless (i.e., sampling increases the running time for finding an optimal solution), and apply it to analyze the running time of the (1+1)-EA on the OneMax and Trap problems in the presence of additive Gaussian noise. Our theoretical analysis indicates that sampling is not helpful in these examples, which is further confirmed by empirical simulation results.

Aizawa and Wah suggested two adaptive sampling methods: increasing the sample size with the generation number, and allocating a larger sample size to solutions with larger estimated variance. Stagge used a larger sample size for better solutions. Several sequential sampling approaches [7, 8, 10] were later proposed for tournament selection; these first estimate the fitness of two solutions from a small number of samples, and then sequentially add samples until the difference between the two estimates is statistically significant. Adaptive sampling was then incorporated into diverse metaheuristic algorithms (e.g., immune algorithms, particle swarm optimization, and compact differential evolution) to cope with noise efficiently. It has also been employed by evolutionary algorithms for noisy multi-objective optimization [20, 23, 25]. Based on the assumption that the fitness landscape is locally smooth, an alternative approach [9, 22] improves the estimation accuracy without increasing the sampling cost by estimating the fitness of a solution as the average fitness of previously evaluated neighbors.

Sampling has been shown to improve the local performance of evolutionary algorithms (EAs), e.g., to increase the probability of selecting the truly better solution in tournament selection. A practical performance measure of an algorithm, however, is how much time it needs to find a desired solution. On this measure, conflicting conclusions about sampling have been empirically reported.
For example, one study showed that sampling can speed up a standard genetic algorithm on two test functions, whereas another found that sampling led to a larger computation time for a simple generational genetic algorithm on the OneMax function.

In this paper, we investigate the effectiveness of sampling via rigorous running time analysis, which measures how soon an algorithm can solve a problem (i.e., the number of fitness evaluations until an optimal solution is found) and has been a leading theoretical aspect for randomized search heuristics [3, 19]. We provide a sufficient condition under which sampling is useless (i.e., sampling increases the running time). Applying it to the (1+1)-EA solving the Noisy OneMax and Noisy Trap problems under additive Gaussian noise, we show that sampling is ineffective in both cases, for different reasons. The derived theoretical results are also empirically verified. The results may help in understanding the effect of noise and in designing better strategies for handling noisy fitness functions.

The rest of this paper is organized as follows. Section 2 introduces some preliminaries. Section 3 presents the main theorem, which is then used in the case studies in Section 4. Section 5 concludes the paper and discusses future work.

Preliminaries

Sampling and Optimization in the Presence of Noise

An optimization problem can be generally represented as $\arg\max_{x \in X} f(x)$, where $X$ is the feasible solution space and the objective $f$ is also called fitness in the context of evolutionary computation. In real-world optimization tasks, the fitness evaluation of a solution is usually disturbed by noise due to a wide range of uncertainties (e.g., randomized simulations), and consequently we cannot obtain the exact fitness value but only a noisy one. A commonly studied noise model is additive noise as presented in Definition 1, which will also be adopted in this paper.
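The setting described above (additive Gaussian noise, sampling by averaging, and running time counted in fitness evaluations) can be illustrated with a minimal Python sketch. It is not the paper's formal algorithm: the function names, the `max_evals` cap, the choice to re-evaluate both parent and offspring in every iteration, and the use of the true fitness to detect the optimum are all assumptions made here for illustration.

```python
import random
import statistics


def onemax(x):
    """True (noise-free) OneMax fitness: the number of 1-bits."""
    return sum(x)


def noisy_fitness(x, sigma, rng):
    """One evaluation under additive Gaussian noise: f(x) + N(0, sigma^2)."""
    return onemax(x) + rng.gauss(0.0, sigma)


def sampled_fitness(x, k, sigma, rng):
    """Sampling with sample size k: average k independent noisy evaluations."""
    return sum(noisy_fitness(x, sigma, rng) for _ in range(k)) / k


def one_plus_one_ea(n, k, sigma, rng, max_evals=100_000):
    """Illustrative (1+1)-EA on noisy OneMax.

    Returns the number of noisy fitness evaluations spent before the
    current solution is the true optimum, or before the budget
    max_evals (a hypothetical safety cap) would be exceeded.
    """
    x = [rng.randint(0, 1) for _ in range(n)]
    evals = 0
    # Assumption: the stopping test uses the true fitness, which is only
    # possible in a simulation where the noise-free function is known.
    while onemax(x) < n and evals + 2 * k <= max_evals:
        # Bit-wise mutation: flip each bit independently with probability 1/n.
        y = [1 - b if rng.random() < 1.0 / n else b for b in x]
        # Assumption: parent and offspring are both re-estimated each
        # iteration, so one iteration costs 2k evaluations.
        fx = sampled_fitness(x, k, sigma, rng)
        fy = sampled_fitness(y, k, sigma, rng)
        evals += 2 * k
        if fy >= fx:
            x = y
    return evals


if __name__ == "__main__":
    rng = random.Random(0)
    x = [1] * 5 + [0] * 5
    # Spread of the fitness estimate for two sample sizes.
    for k in (1, 25):
        estimates = [sampled_fitness(x, k, 1.0, rng) for _ in range(2000)]
        print("k =", k, "std of estimate:", round(statistics.stdev(estimates), 3))
    # Average number of evaluations over a few runs for two sample sizes.
    for k in (1, 5):
        runs = [one_plus_one_ea(10, k, 1.0, rng) for _ in range(20)]
        print("k =", k, "mean evaluations:", statistics.mean(runs))
```

Averaging k independent evaluations reduces the standard deviation of the estimate by a factor of sqrt(k), while every iteration becomes k times more expensive. Whether the better estimate pays for the extra evaluations is the question the running time analysis addresses.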