TY - GEN
T1 - GECCO’2022 Symbolic Regression Competition
T2 - 2023 Genetic and Evolutionary Computation Conference Companion, GECCO 2023 Companion
AU - Burlacu, Bogdan
N1 - Publisher Copyright: © 2023 Copyright held by the owner/author(s).
PY - 2023/7/15
Y1 - 2023/7/15
N2 - Operon is a C++ framework for symbolic regression that can perform local search by optimizing model coefficients with the Levenberg-Marquardt algorithm. This enhancement has proven effective in a variety of regression tasks. Operon took part in the Interpretable Symbolic Regression for Data Science competition hosted at the 2022 Genetic and Evolutionary Computation Conference, where it ranked 4th overall based on accuracy, simplicity, and task-specific goals. Although accurate, the returned models were exceedingly complex and ranked poorly in terms of simplicity. In this paper, we investigate the application of the Minimum Description Length (MDL) principle for selecting models with a better compromise between accuracy and complexity from the final Pareto front returned by the algorithm. A new experiment on the synthetic track of the competition highlights the critical role played by model selection in algorithm performance. The MDL-enhanced approach obtains the best overall score and demonstrates excellent results on all synthetic tracks.
KW - bayesian information criterion
KW - interpretability
KW - minimum description length
KW - model selection
KW - overfitting
KW - symbolic regression
UR - http://www.scopus.com/inward/record.url?scp=85169062373&partnerID=8YFLogxK
U2 - 10.1145/3583133.3596390
DO - 10.1145/3583133.3596390
M3 - Conference contribution
AN - SCOPUS:85169062373
T3 - GECCO 2023 Companion - Proceedings of the 2023 Genetic and Evolutionary Computation Conference Companion
SP - 2412
EP - 2419
BT - GECCO 2023 Companion - Proceedings of the 2023 Genetic and Evolutionary Computation Conference Companion
PB - Association for Computing Machinery, Inc
Y2 - 15 July 2023 through 19 July 2023
ER -