Genetic Symbolic Regression via Prior Knowledge Backpropagation

  • Agostino Rizzo

Student thesis: Master's Thesis

Abstract

Contrary to many Machine Learning tasks, the major interest of Symbolic Regression (SR) is the induction of fully interpretable models from a set of numerical data points. To keep them interpretable, models are represented as analytical expressions that describe the phenomenon behind the data. Traditionally, SR is performed using a biologically inspired methodology called Genetic Programming (GP), in which candidate models are evolved to optimize some measure of fit over the dataset. However, especially in physical domains, collecting data or observations can be impractical, costly, and sometimes even dangerous. In some cases, the limited precision of the instruments used leads to particularly noisy datasets whose observations are not uniformly distributed across the input domain. To enforce the plausibility of the model as well as its extrapolation capability, prior knowledge can be provided as constraints on the image, monotonicity, and concavity of the function.
Recently, several works have been proposed with the objective of integrating prior knowledge about the model to be sought into a regression algorithm. They showed relevant results and, most importantly, the benefit of prior knowledge in finding good models when dealing with small and/or noisy datasets. However, several aspects of the field remain unclear in the literature. In this thesis, we propose the novel knowledge backpropagation (KBP) technique as a possible solution to improve the satisfiability of candidate solutions in a genetic programming algorithm without necessarily affecting the design of the fitness function. The proposed approach has been implemented, and its impact on solving symbolic regression problems with prior knowledge has been assessed by testing the system on several regression benchmarks. Results show that the approach is able to discover feasible solutions with better model accuracy than the classical genetic programming algorithm.
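To make the three constraint types named in the abstract (image, monotonicity, concavity) concrete, the following is a minimal sketch of how a candidate expression could be tested against them on a sample grid. It is not the KBP technique, which the abstract does not detail; the function name `check_constraints`, its parameters, and the finite-difference test are all assumptions introduced here for illustration.

```python
import math
from typing import Callable, Dict, Optional, Tuple


def check_constraints(
    f: Callable[[float], float],
    lo: float,
    hi: float,
    image: Tuple[float, float] = (-math.inf, math.inf),
    monotonic: Optional[str] = None,   # "increasing", "decreasing", or None
    curvature: Optional[str] = None,   # "convex", "concave", or None
    n: int = 200,
    tol: float = 1e-9,
) -> Dict[str, bool]:
    """Hypothetical sample-based feasibility check on [lo, hi].

    Evaluates f on a uniform grid and uses first and second finite
    differences as stand-ins for the signs of f' and f''. Passing this
    check is evidence of feasibility, not a proof.
    """
    xs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    ys = [f(x) for x in xs]

    # Image constraint: every sampled output lies within the given bounds.
    ok_image = all(image[0] - tol <= y <= image[1] + tol for y in ys)

    # First differences approximate the sign of the first derivative.
    d1 = [b - a for a, b in zip(ys, ys[1:])]
    if monotonic == "increasing":
        ok_mono = all(d >= -tol for d in d1)
    elif monotonic == "decreasing":
        ok_mono = all(d <= tol for d in d1)
    else:
        ok_mono = True

    # Second differences approximate the sign of the second derivative.
    d2 = [b - a for a, b in zip(d1, d1[1:])]
    if curvature == "convex":
        ok_curv = all(d >= -tol for d in d2)
    elif curvature == "concave":
        ok_curv = all(d <= tol for d in d2)
    else:
        ok_curv = True

    return {"image": ok_image, "monotonic": ok_mono, "curvature": ok_curv}


if __name__ == "__main__":
    # Candidate model sqrt(x) on [0.1, 4] with assumed prior knowledge:
    # outputs in [0, 3], increasing, concave.
    print(check_constraints(math.sqrt, 0.1, 4.0, image=(0.0, 3.0),
                            monotonic="increasing", curvature="concave"))
```

In a GP setting, a check of this kind could be used to flag a candidate as feasible or infeasible independently of its fitting error, which is one way to read the abstract's remark about not necessarily changing the fitness function.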
Date of Award: 2024
Original language: English (American)
Supervisor: Michael Affenzeller (Supervisor) & Simona Perri (Supervisor)
