Pattern Mining and Genetic Improvement in Compilers and Interpreters

Research output: Types of ThesesDoctoral Thesis


Writing source code is a challenging task, requiring the understanding of complex concepts, algorithms and programming paradigms. This task becomes increasingly challenging when source code has to be optimized for non-functional properties such as run-time performance, memory usage or energy efficiency. These properties often depend on in-depth knowledge of the language, the compiler and even the hardware architecture the source code will be run on. This thesis deals with the identification and verification of patterns in software that influence such non-functional properties. The goal is to help get an understanding, which patterns should be considered or avoided when optimizing towards a certain feature, and where to apply these patterns in the software. This is achieved by the novel combination of two distinct fields of research. Software Pattern Mining and Genetic Improvement, a Search- Based Software Engineering technique. The approach is applied directly at the compiler and interpreter level. This enables mining of a fine-granular software representation, combining it with in-depth information about the language, and gathering fine-granular performance measurements. A novel algorithm Knowledge-guided Genetic Improvement is presented that allows the generation of multiple semantically equivalent versions of software. These are then mined for discriminative patterns via the novel Independent Growth of Ordered Relationships algorithm. Several bug patterns, often occurring in the mutation operation of Genetic Improvement (GI), have successfully been identified and proven to produce bugs with an average confidence of 90.1%. Fix patterns reduce these bugs with a confidence of 94%. Identified patterns have been successfully applied in the field of Genetic Improvement by doubling population diversity, and reducing failing individuals in the population to 36.9% compared to 80% identified in related work. This allows Knowledge-guided Genetic Improvement to improve the run-time performance of 22 out of 25 algorithms by an average of 33.5%. The work also successfully identifies anti-patterns and patterns in the run-time domain that are responsible for this improvement. This thesis lays a foundation for identifying and verifying patterns in the non-functional domain. As a side effect, it also makes the results of experiments in the domain of Genetic Improvement explainable. This opens up opportunities to drive research in the domains Software Pattern Mining and Genetic Improvement even further. In the future, identified patterns may even be directly applicable in a compiler or interpreter.
Original languageEnglish
QualificationDr. techn.
Awarding Institution
  • Mössenböck, Hanspeter, Supervisor, External person
Award date28 Apr 2022
Publication statusPublished - 28 Apr 2022


Dive into the research topics of 'Pattern Mining and Genetic Improvement in Compilers and Interpreters'. Together they form a unique fingerprint.

Cite this