Incorporating expert knowledge in evolutionary search

Michael D. Schmidt, Hod Lipson
2009 Proceedings of the 11th Annual conference on Genetic and evolutionary computation - GECCO '09  
We investigated several methods for utilizing expert knowledge in evolutionary search and compared their impact on performance and on scalability to increasingly complex problems. We collected data over one thousand randomly generated problems. We then simulated collecting expert knowledge for each problem by optimizing an approximated version of the exact solution. We then compared six different methods of seeding the approximate model into the genetic program, such as using the entire approximate model at once or breaking it into pieces. Contrary to common intuition, we found that inserting the complete expert solution into the population is not the best way to utilize that information; using parts of that solution is often more effective. Additionally, we found that each method scaled differently with the complexity and accuracy of the approximate solution. Inserting randomized pieces of the approximate solution into the population scaled best to high-complexity problems and was the most invariant to the accuracy of the approximate solution. Furthermore, this method produced the least bloated solutions of all methods. In general, methods that used randomized parameter coefficients scaled best with the approximation error, and methods that inserted entire approximate solutions scaled worst with problem complexity.
Symbolic regression [5] is a type of genetic program that computationally searches the space of expressions by minimizing various error metrics. Both the parameters and the form of the equation are subject to search. In symbolic regression, many initially
doi:10.1145/1569901.1570048 dblp:conf/gecco/SchmidtL09a fatcat:iat7eij46najje4ycy43nwmbgi
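
The seeding idea described in the abstract (inserting randomized pieces of an approximate solution into the population) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the tuple-based expression encoding, the Gaussian redraw of constants, and the names `subtrees`, `randomize_constants`, and `seed_population` are all hypothetical, and the abstract does not spell out the six methods that were actually compared.

```python
import random

# Assumed encoding (hypothetical): an expression is a nested tuple such as
# ("add", left, right) or ("sin", arg), a variable name string such as "x",
# or a float constant.

def subtrees(expr):
    """Yield every subexpression of expr, including expr itself."""
    yield expr
    if isinstance(expr, tuple):
        for child in expr[1:]:
            yield from subtrees(child)

def randomize_constants(expr, rng, scale=1.0):
    """Return a copy of expr with every float constant redrawn at random."""
    if isinstance(expr, float):
        return rng.gauss(0.0, scale)
    if isinstance(expr, tuple):
        children = tuple(randomize_constants(c, rng, scale) for c in expr[1:])
        return (expr[0],) + children
    return expr  # variable name, left unchanged

def seed_population(population, approx, n_seeds, rng, randomize=True):
    """Overwrite n_seeds randomly chosen individuals with pieces of approx.

    Only operator nodes (tuples) are used as pieces, so a bare variable or
    constant is never inserted on its own.
    """
    pieces = [s for s in subtrees(approx) if isinstance(s, tuple)]
    seeded = list(population)
    for idx in rng.sample(range(len(seeded)), n_seeds):
        piece = rng.choice(pieces)
        seeded[idx] = randomize_constants(piece, rng) if randomize else piece
    return seeded

if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical approximate expert model: 2.0 * x + sin(x)
    approx = ("add", ("mul", 2.0, "x"), ("sin", "x"))
    population = ["x"] * 10  # stand-in for a randomly initialized population
    for individual in seed_population(population, approx, 4, rng):
        print(individual)
```

Passing `randomize=False` keeps the original coefficients of each piece, and restricting `pieces` to `[approx]` would correspond to inserting the whole approximate model, which gives a rough way to contrast the kinds of variants the abstract mentions.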