Among the methods frequently used for data-based system identification, genetic programming (GP) has already been applied successfully to data mining problems in several scientific domains. Extended functional bases, additional optimization phases, and further developed selection mechanisms contribute substantially to the method's ability to generate high-quality results for various kinds of data-based identification scenarios. Although the optimization of the method and its parameter settings has already been investigated extensively, there is still rather little systematic analysis of internal processes, namely genetic dynamics and the progress of genetic diversity, during the execution of GP-based identification using these algorithmic extensions. In this paper, we report results of investigations into exactly these aspects. We have developed methods and statistical features that describe the genetic diversity and dynamics of GP-based structure identification algorithms; here, we introduce a statistical analysis of genetic diversity with respect to variables and time offset settings within GP populations. Genetic diversity is characterized, among other aspects, by the occurrence of variables in the models in which they are used; statistical methods for estimating the respective impact features are also presented. Data sets representing two different kinds of systems (complex mechatronic systems and medical benchmark data) have been used for empirical tests; furthermore, standard implementations of genetic programming are compared to extended techniques including offspring selection and sliding window techniques.
|Number of pages|8|
|Publication status|Published - 2008|
- Genetic diversity
- Genetic programming
- Systems identification