Comparing Methods for Estimating Marginal Likelihood in Symbolic Regression

Patrick Leser, Geoffrey Bomarito, Gabriel Kronberger, Fabrício Olivetti De França

Research output: Chapter in Book/Report/Conference proceedings > Conference contribution > peer-review

Abstract

Marginal likelihood has been proposed as a genetic programming-based symbolic regression (GPSR) fitness metric to prevent overly complex expressions and overfitting, particularly when data is limited and noisy. Here, two particular methods for estimating marginal likelihood - the Laplace approximation and sequential Monte Carlo - are studied with a focus on tradeoffs between accuracy and computational efficiency. The comparison focuses on practical challenges in the context of two sets of example problems. First, the methods are compared on handcrafted expressions exhibiting nonlinearity and multimodality in their respective posterior distributions. Next, the methods are compared on a real-world set of equations produced by GPSR using training data from a well-known symbolic regression benchmark. A key finding is that there are potentially significant differences between the methods that, for example, could lead to conflicting selection of expressions within a GPSR implementation. However, it is concluded that there are scenarios where either method could be preferred over the other based on accuracy or computational budget. Algorithmic improvements for both methods as well as future areas of study are discussed.
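As an illustration of the first method named in the abstract, the following is a minimal sketch of a Laplace approximation to the log marginal likelihood. It does not come from the paper: the toy model y = theta * x + noise, the Gaussian prior on theta, and all function and parameter names (`log_joint`, `laplace_log_evidence`, `exact_log_evidence`, `sigma`, `tau`) are assumptions chosen for illustration. This model is linear in theta with Gaussian noise, so the posterior is Gaussian and the Laplace estimate should coincide with the closed-form value. For the nonlinear, multimodal posteriors the abstract describes, it would only be an approximation.

```python
import math


def log_joint(theta, xs, ys, sigma, tau):
    """Log likelihood plus log prior for the toy model y = theta * x + noise.

    Assumptions (illustrative only): noise ~ N(0, sigma^2), theta ~ N(0, tau^2).
    """
    n = len(xs)
    sse = sum((y - theta * x) ** 2 for x, y in zip(xs, ys))
    log_lik = -0.5 * n * math.log(2 * math.pi * sigma**2) - 0.5 * sse / sigma**2
    log_prior = -0.5 * math.log(2 * math.pi * tau**2) - 0.5 * theta**2 / tau**2
    return log_lik + log_prior


def laplace_log_evidence(xs, ys, sigma, tau):
    """Laplace estimate: log Z ~ log p(y, theta_map) + (d/2) log(2 pi) - 0.5 log|H|."""
    # H is the negative second derivative of the log joint; constant for this model.
    hess = sum(x * x for x in xs) / sigma**2 + 1.0 / tau**2
    # The MAP estimate has a closed form here; in general it needs an optimizer.
    theta_map = (sum(x * y for x, y in zip(xs, ys)) / sigma**2) / hess
    d = 1  # number of parameters
    return (
        log_joint(theta_map, xs, ys, sigma, tau)
        + 0.5 * d * math.log(2 * math.pi)
        - 0.5 * math.log(hess)
    )


def exact_log_evidence(xs, ys, sigma, tau):
    """Closed-form log evidence: y ~ N(0, sigma^2 I + tau^2 x x^T)."""
    n = len(xs)
    xx = sum(x * x for x in xs)
    xy = sum(x * y for x, y in zip(xs, ys))
    yy = sum(y * y for y in ys)
    # Matrix determinant lemma and Sherman-Morrison for the rank-one covariance.
    log_det = n * math.log(sigma**2) + math.log(1.0 + tau**2 * xx / sigma**2)
    quad = (yy - tau**2 * xy**2 / (sigma**2 + tau**2 * xx)) / sigma**2
    return -0.5 * n * math.log(2 * math.pi) - 0.5 * log_det - 0.5 * quad


if __name__ == "__main__":
    xs = [0.5, 1.0, 1.5, 2.0]
    ys = [1.1, 1.9, 3.2, 3.9]
    print(laplace_log_evidence(xs, ys, sigma=0.3, tau=2.0))
    print(exact_log_evidence(xs, ys, sigma=0.3, tau=2.0))
```

The sketch needs only the curvature at a single mode, which is what makes the Laplace approximation cheap. A sampling-based estimator such as sequential Monte Carlo would instead explore the whole posterior at a higher computational cost.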

Original language: English
Title of host publication: GECCO 2024 Companion - Proceedings of the 2024 Genetic and Evolutionary Computation Conference Companion
Publisher: Association for Computing Machinery, Inc
Pages: 2058-2066
Number of pages: 9
ISBN (Electronic): 9798400704956
Publication status: Published - 14 Jul 2024
Event: 2024 Genetic and Evolutionary Computation Conference Companion, GECCO 2024 Companion - Melbourne, Australia
Duration: 14 Jul 2024 to 18 Jul 2024

Publication series

Name: GECCO 2024 Companion - Proceedings of the 2024 Genetic and Evolutionary Computation Conference Companion

Conference

Conference: 2024 Genetic and Evolutionary Computation Conference Companion, GECCO 2024 Companion
Country/Territory: Australia
City: Melbourne
Period: 14.07.2024 to 18.07.2024

Keywords

  • equation learning
  • marginal likelihood
  • model selection
  • symbolic regression
