TY - GEN
T1 - Surrogates for Fair-Weather Photovoltaic Module Output
AU - Falkner, Dominik
AU - Bögl, Michael
AU - Langthallner, Ines
AU - Zenisek, Jan
AU - Affenzeller, Michael
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
PY - 2025
Y1 - 2025
N2 - In the field of time series analysis, the scarcity of comprehensive datasets poses a significant challenge for the development of reliable predictive models. This study addresses the difficulties in forecasting solar module outputs and enhancing data accessibility for modeling, especially in residential sectors. We propose a general method to establish a distribution of photovoltaic module parameters across a country and, from this, generate a synthetic dataset for simulating and modeling PV module output. This approach integrates multiple freely available data sources. The study is focused on Germany, utilizing the Marktstammdatenregister as its main source for the module parameter distribution. The data is then enriched using publicly available data. Based upon this, a crawler is developed to gather fair-weather module outputs from the Photovoltaic Geographical Information System for training, testing, and benchmarking purposes. One benchmark has fixed locations and the second one has fixed module parameters. Additionally, we provide a data loader with artificial degradation for all datasets. In the last step, we test multiple state-of-the-art models on the dataset and show that the proposed forecasting task is not trivial. All code and data are publicly available.
AB - In the field of time series analysis, the scarcity of comprehensive datasets poses a significant challenge for the development of reliable predictive models. This study addresses the difficulties in forecasting solar module outputs and enhancing data accessibility for modeling, especially in residential sectors. We propose a general method to establish a distribution of photovoltaic module parameters across a country and, from this, generate a synthetic dataset for simulating and modeling PV module output. This approach integrates multiple freely available data sources. The study is focused on Germany, utilizing the Marktstammdatenregister as its main source for the module parameter distribution. The data is then enriched using publicly available data. Based upon this, a crawler is developed to gather fair-weather module outputs from the Photovoltaic Geographical Information System for training, testing, and benchmarking purposes. One benchmark has fixed locations and the second one has fixed module parameters. Additionally, we provide a data loader with artificial degradation for all datasets. In the last step, we test multiple state-of-the-art models on the dataset and show that the proposed forecasting task is not trivial. All code and data are publicly available.
KW - distributions
KW - machine learning
KW - photovoltaic
KW - pvgis
KW - surrogate
UR - http://www.scopus.com/inward/record.url?scp=105004407437&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-83885-9_15
DO - 10.1007/978-3-031-83885-9_15
M3 - Conference contribution
AN - SCOPUS:105004407437
SN - 9783031838873
T3 - Lecture Notes in Computer Science
SP - 154
EP - 166
BT - Computer Aided Systems Theory – EUROCAST 2024 - 19th International Conference, 2024, Revised Selected Papers
A2 - Quesada-Arencibia, Alexis
A2 - Affenzeller, Michael
A2 - Moreno-Díaz, Roberto
PB - Springer
T2 - 19th International Conference on Computer Aided Systems Theory, EUROCAST 2024
Y2 - 25 February 2024 through 1 March 2024
ER -