TY - CONF
T1 - Multiview Symbolic Regression
AU - Russeil, Etienne
AU - de Franca, Fabricio Olivetti
AU - Malanchev, Konstantin
AU - Burlacu, Bogdan
AU - Ishida, Emille
AU - Leroux, Marion
AU - Michelin, Clément
AU - Moinard, Guillaume
AU - Gangler, Emmanuel
N1 - Publisher Copyright:
© 2024 Copyright is held by the owner/author(s). Publication rights licensed to ACM.
PY - 2024/7/14
Y1 - 2024/7/14
N2 - Symbolic regression (SR) searches for analytical expressions representing the relationship between explanatory and response variables. Current SR methods assume a single dataset extracted from a single experiment. Nevertheless, frequently, the researcher is confronted with multiple sets of results obtained from experiments conducted with different set-ups. Traditional SR methods may fail to find the underlying expression since the parameters of each experiment can be different. In this work we present Multiview Symbolic Regression (MvSR), which takes into account multiple datasets simultaneously, mimicking experimental environments, and outputs a general parametric solution. This approach fits the evaluated expression to each independent dataset and returns a parametric family of functions f(x; θ) simultaneously capable of accurately fitting all datasets. We demonstrate the effectiveness of MvSR using data generated from known expressions, as well as real-world data from astronomy, chemistry and economy, for which an a priori analytical expression is not available. Results show that MvSR obtains the correct expression more frequently and is robust to changes in hyperparameters. In real-world data, it is able to grasp the group behaviour, recovering known expressions from the literature as well as promising alternatives, thus enabling the use of MvSR in a wide range of experimental scenarios.
AB - Symbolic regression (SR) searches for analytical expressions representing the relationship between explanatory and response variables. Current SR methods assume a single dataset extracted from a single experiment. Nevertheless, frequently, the researcher is confronted with multiple sets of results obtained from experiments conducted with different set-ups. Traditional SR methods may fail to find the underlying expression since the parameters of each experiment can be different. In this work we present Multiview Symbolic Regression (MvSR), which takes into account multiple datasets simultaneously, mimicking experimental environments, and outputs a general parametric solution. This approach fits the evaluated expression to each independent dataset and returns a parametric family of functions f(x; θ) simultaneously capable of accurately fitting all datasets. We demonstrate the effectiveness of MvSR using data generated from known expressions, as well as real-world data from astronomy, chemistry and economy, for which an a priori analytical expression is not available. Results show that MvSR obtains the correct expression more frequently and is robust to changes in hyperparameters. In real-world data, it is able to grasp the group behaviour, recovering known expressions from the literature as well as promising alternatives, thus enabling the use of MvSR in a wide range of experimental scenarios.
KW - genetic programming
KW - interpretability
KW - symbolic regression
UR - http://www.scopus.com/inward/record.url?scp=85206900826&partnerID=8YFLogxK
U2 - 10.1145/3638529.3654087
DO - 10.1145/3638529.3654087
M3 - Paper
SP - 961
EP - 970
ER -