TY - GEN
T1 - The Inefficiency of Genetic Programming for Symbolic Regression
AU - Kronberger, Gabriel
AU - Olivetti de Franca, Fabricio
AU - Desmond, Harry
AU - Bartlett, Deaglan J.
AU - Kammerer, Lukas
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
PY - 2024/9
Y1 - 2024/9
N2 - We analyse the search behaviour of genetic programming (GP) for symbolic regression (SR) in search spaces that are small enough to allow exhaustive enumeration, and use an improved exhaustive symbolic regression algorithm to generate the set of semantically unique expression structures, which is orders of magnitude smaller than the original SR search space. The efficiency of GP and a hypothetical random search in this set of unique expressions is compared, whereby the efficiency is quantified via the number of function evaluations performed until a given error threshold is reached, and the percentage of unique expressions evaluated during the search after simplification to a canonical form. The results for two real-world datasets with a single input variable show that GP in such limited search space explores only a small fraction of the search space, and evaluates semantically equivalent expressions repeatedly. GP has a smaller success probability than the idealised random search for such small search spaces.
AB - We analyse the search behaviour of genetic programming (GP) for symbolic regression (SR) in search spaces that are small enough to allow exhaustive enumeration, and use an improved exhaustive symbolic regression algorithm to generate the set of semantically unique expression structures, which is orders of magnitude smaller than the original SR search space. The efficiency of GP and a hypothetical random search in this set of unique expressions is compared, whereby the efficiency is quantified via the number of function evaluations performed until a given error threshold is reached, and the percentage of unique expressions evaluated during the search after simplification to a canonical form. The results for two real-world datasets with a single input variable show that GP in such limited search space explores only a small fraction of the search space, and evaluates semantically equivalent expressions repeatedly. GP has a smaller success probability than the idealised random search for such small search spaces.
KW - Genetic Programming
KW - Search space
KW - Symbolic Regression
UR - http://www.scopus.com/inward/record.url?scp=85204543306&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-70055-2_17
DO - 10.1007/978-3-031-70055-2_17
M3 - Conference contribution
AN - SCOPUS:85204543306
SN - 9783031700545
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 273
EP - 289
BT - Parallel Problem Solving from Nature – PPSN XVIII - 18th International Conference, PPSN 2024, Proceedings
A2 - Affenzeller, Michael
A2 - Winkler, Stephan M.
A2 - Kononova, Anna V.
A2 - Bäck, Thomas
A2 - Trautmann, Heike
A2 - Tušar, Tea
A2 - Machado, Penousal
PB - Springer
T2 - 18th International Conference on Parallel Problem Solving from Nature, PPSN 2024
Y2 - 14 September 2024 through 18 September 2024
ER -