Code bloat is a phenomenon in genetic programming in which solution size increases without a corresponding improvement in solution quality. This thesis reviews the fundamentals of tree-based genetic programming and several existing measures to counteract bloat. One such measure is pruning, which removes nodes or entire subtrees from syntax trees based on their estimated impact on solution quality. HeuristicLab, a framework for heuristic optimization, implements a pruning analyzer that is tested in this thesis. The research question concerns the usability of the pruning analyzer and its impact on solution quality, solution size, generalization, and execution time. Tests were performed on one real-world and one artificial problem, each with three different algorithms: a standard genetic algorithm (GA), an offspring selection genetic algorithm (OSGA), and the nondominated sorting genetic algorithm II (NSGA-II). The results show that pruning has a significant impact on the produced solutions. It can increase solution quality, but it can also be used to decrease solution size at the cost of solution quality. HeuristicLab’s pruning analyzer offers multiple settings to adjust pruning behavior. This allows fine-tuning; however, it is difficult to achieve good results without testing different parameter settings. The behavior is hard to predict and depends not only on the pruning configuration but also on the problem and the maximum allowed tree size.
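The idea of impact-based pruning mentioned in the abstract can be illustrated with a minimal Python sketch. It is not HeuristicLab's implementation or API: the names `Node`, `evaluate`, `mse`, `prune`, and `threshold` are hypothetical, and two details are assumptions made for illustration only, namely that a subtree's impact is measured as the change in mean squared error when the subtree is replaced, and that the replacement is a constant equal to the subtree's mean output on the data.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Dict, List

Row = Dict[str, float]


@dataclass
class Node:
    """Hypothetical syntax tree node: op is "add", "mul", "var", or "const"."""
    op: str
    children: List["Node"] = field(default_factory=list)
    name: str = ""      # variable name, used when op == "var"
    value: float = 0.0  # constant value, used when op == "const"


def evaluate(node: Node, row: Row) -> float:
    """Evaluate the tree on one data row."""
    if node.op == "const":
        return node.value
    if node.op == "var":
        return row[node.name]
    left, right = (evaluate(child, row) for child in node.children)
    return left + right if node.op == "add" else left * right


def size(node: Node) -> int:
    """Number of nodes in the tree."""
    return 1 + sum(size(child) for child in node.children)


def mse(tree: Node, rows: List[Row], targets: List[float]) -> float:
    """Mean squared error of the tree's predictions."""
    return mean((evaluate(tree, r) - t) ** 2 for r, t in zip(rows, targets))


def prune(tree: Node, rows: List[Row], targets: List[float], threshold: float) -> Node:
    """Replace non-leaf subtrees by a constant when doing so raises the MSE
    by at most `threshold` (assumed impact measure, see lead-in)."""
    def visit(node: Node) -> None:
        for i, child in enumerate(node.children):
            if not child.children:
                continue  # leaves cannot be made smaller
            baseline = mse(tree, rows, targets)
            constant = Node("const", value=mean(evaluate(child, r) for r in rows))
            node.children[i] = constant
            impact = mse(tree, rows, targets) - baseline
            if impact > threshold:
                node.children[i] = child  # subtree matters: restore it
                visit(child)              # and look for prunable parts inside
    visit(tree)
    return tree


# Target is y = 2 * x; the second summand, 0 * (x + x), is pure bloat.
rows = [{"x": 1.0}, {"x": 2.0}, {"x": 3.0}]
targets = [2.0, 4.0, 6.0]
x = lambda: Node("var", name="x")
tree = Node("add", [
    Node("mul", [Node("const", value=2.0), x()]),
    Node("mul", [Node("const", value=0.0), Node("add", [x(), x()])]),
])
size_before = size(tree)
pruned = prune(tree, rows, targets, threshold=1e-9)
print(size_before, size(pruned))  # 9 5
```

In this sketch a larger `threshold` removes more subtrees and accepts a larger loss in quality, which mirrors the trade-off between solution size and solution quality described in the abstract.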
Continuous Pruning in Tree-Based Genetic Programming
Kefer, C. (Author). 2025
Student thesis: Master's Thesis