Description
A distinguishing feature of symbolic regression using genetic programming is its ability to identify complex nonlinear white-box models. The white-box aspect is especially relevant in industrial applications, where models are extensively scrutinized in order to gain knowledge about the underlying processes. However, the potential of genetic programming is often diluted by the ambiguity and complexity of the models produced. In this keynote lecture we present several analysis methods with the common goal of enabling better insight into the symbolic regression process and producing models that are more understandable and generalize better. To gain more information about the process, we monitor and analyze the progress of population diversity, building block information, and, more generally, genealogy information. Furthermore, we discuss methods for analyzing the obtained results with respect to model simplification, relevance of variables, node impacts, and network analysis. These analysis methods were applied to algorithms and results of industrial projects carried out by the HEAL group in recent years. In these projects we mainly focused on time-series modeling of the exhaust emissions of diesel combustion engines, regression modeling of a blast furnace, and the identification of symbolic classifiers for tumor markers and cancer prediction. All presented techniques are publicly available and have been implemented in the open-source optimization system HeuristicLab, designed and developed by our group.
|Period||10 May 2013|
|Event title||Genetic Programming Theory and Practice (GPTP 2013)|
|Location||Ann Arbor, United States|