Schema Analysis in Tree-based Genetic Programming

Research output: Chapter in Book/Report/Conference proceedings › Chapter


In this chapter we adopt the concept of schemata from schema theory and use it to analyze population dynamics in genetic programming for symbolic regression. We define schemata as tree-based wildcard patterns and empirically measure their frequencies in the population at each generation. Our methodology consists of two steps. In the first step, we generate schemata from genealogical information about crossover parents and their offspring, according to several possible schema definitions inspired by the existing literature. In the second step, we identify the individuals that match each schema using a tree pattern matching algorithm. We test our approach on different problem instances and algorithmic flavors, and we investigate the effects of different selection mechanisms on the identified schemata and their frequencies.
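To make the idea of a wildcard schema and its frequency concrete, the following is a minimal sketch and not the chapter's algorithm. It assumes a hypothetical `Node` tree type, a hypothetical wildcard symbol `#` that stands for any single subtree, ordered (non-commutative) child matching, and a schema that may match at any node of an individual. The chapter's actual schema definitions and matching algorithm may differ on each of these points.

```python
from dataclasses import dataclass, field
from typing import List

WILDCARD = "#"  # assumed symbol: matches any single subtree


@dataclass
class Node:
    """Hypothetical expression-tree node (function or terminal symbol)."""
    label: str
    children: List["Node"] = field(default_factory=list)


def matches_at_root(schema: Node, tree: Node) -> bool:
    """True if the schema matches the tree when both roots are aligned."""
    if schema.label == WILDCARD:
        return True  # the wildcard absorbs the entire subtree
    if schema.label != tree.label:
        return False
    if len(schema.children) != len(tree.children):
        return False
    # Children are compared in order; commutativity is not considered.
    return all(matches_at_root(s, t)
               for s, t in zip(schema.children, tree.children))


def matches_anywhere(schema: Node, tree: Node) -> bool:
    """True if the schema matches at the root or at any descendant node."""
    if matches_at_root(schema, tree):
        return True
    return any(matches_anywhere(schema, child) for child in tree.children)


def schema_frequency(schema: Node, population: List[Node]) -> float:
    """Fraction of individuals in the population matched by the schema."""
    if not population:
        return 0.0
    hits = sum(1 for individual in population
               if matches_anywhere(schema, individual))
    return hits / len(population)


if __name__ == "__main__":
    x, one = Node("x"), Node("1")
    population = [
        Node("+", [x, one]),                    # x + 1
        Node("+", [Node("*", [x, x]), one]),    # (x * x) + 1
        Node("-", [x, one]),                    # x - 1
    ]
    schema = Node("+", [Node(WILDCARD), one])   # (+ # 1)
    print(schema_frequency(schema, population))
```

In this toy population the schema `(+ # 1)` matches the first two individuals but not the third, so its frequency is two thirds. Computing this value once per generation gives the kind of frequency trace the abstract describes.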
Original language: English
Title of host publication: Genetic Programming in Theory and Practice XV
Publication status: Published - 2018


  • genetic programming
  • schema analysis
  • population diversity
  • tree pattern matching
  • symbolic regression

