Small and medium-sized enterprises (SMEs) are the backbone of the Austrian economy, representing 99.8% of all businesses. They contribute significantly to economic value creation and employment, yet they are disproportionately vulnerable to insolvency. This vulnerability can be attributed to structural weaknesses such as limited capital resources, low financial transparency and high sensitivity to macroeconomic fluctuations. The rising number of corporate bankruptcies since the COVID-19 pandemic highlights the need for precise forecasting tools tailored specifically to SMEs. Early identification of impending insolvencies is therefore crucial for business decision-makers, investors, financial institutions and economic policymakers. Despite the economic importance of SMEs, there are still few empirically grounded studies on insolvency forecasting for Austrian SMEs. Internationally, by contrast, predictive models have been established since the 1960s; they increasingly integrate machine learning methods, and studies systematically compare traditional statistical methods with modern machine learning techniques. This master's thesis focuses on precisely this systematic comparison. It begins with an exploration of the theoretical foundations relevant to developing an insolvency forecasting model, followed by an empirical analysis of insolvency predictions for Austrian SMEs in the construction industry. The data source is an extensive dataset from the Austrian credit protection association Kreditschutzverband von 1870 (KSV1870) covering the period from 1996 to 2023. It includes 24,894 companies, of which 1,990 are insolvency cases. The analysis covers four forecasting horizons: 1 year, 2-3 years, 4-5 years, and 6-10 years before insolvency. Six models are compared: Logistic Regression (LR), Decision Tree (DT), Random Forest (RF), Gradient Boosting (GB), Support Vector Machine (SVM), and Neural Networks (NN).
The empirical analysis demonstrates that data preparation is a crucial step in the modeling process. Selectively eliminating outliers using the triple interquartile range increases prediction accuracy by an average of 9.35 percentage points compared to modeling without outlier removal. In addition to inconsistencies in the data, the limited disclosure requirements for micro and small businesses pose a significant limitation. They result in highly aggregated balance sheets, which restrict the pool of financial variables. As a result, only seven indicators could be calculated, and their impact varies depending on the model and forecasting horizon. Nevertheless, a consistent trend of financial erosion leading up to insolvency is observable across all horizons. The average equity ratio of insolvent companies in the year before insolvency is -26.9%. Because so few indicators are available, feature selection methods (filter and wrapper methods) provide no significant added value. A separate analysis shows that integrating non-financial variables (region, company age) improves prediction accuracy. Macroeconomic variables, however, do not appear to have any noticeable impact on model performance. In the model comparison, LR consistently performs worse than the machine learning methods. Among the machine learning methods, DT lags significantly, while GB achieves the best results in most forecasting horizons. SVM performs convincingly only in a few cases, while NN and RF deliver comparable results in the one-year horizon.
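The outlier step mentioned above can be illustrated with a minimal sketch. It assumes that "triple interquartile range" means fences at Q1 - 3·IQR and Q3 + 3·IQR, which the abstract does not spell out. The quartile method, the function names, the `equity_ratio` column and the sample values are likewise hypothetical and not taken from the thesis or the KSV1870 dataset.

```python
from statistics import quantiles


def iqr_fences(values, k=3.0):
    """Return (lower, upper) fences at Q1 - k*IQR and Q3 + k*IQR."""
    # Quartile definition is an assumption; the thesis may use another one.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers(rows, column, k=3.0):
    """Keep only rows whose value in `column` lies within the k*IQR fences."""
    lower, upper = iqr_fences([row[column] for row in rows], k)
    return [row for row in rows if lower <= row[column] <= upper]


# Hypothetical equity ratios in percent, for illustration only.
sample = [
    {"equity_ratio": v}
    for v in [-30.0, -5.0, 10.0, 15.0, 20.0, 25.0, 30.0, 35.0, 900.0]
]
cleaned = drop_outliers(sample, "equity_ratio")
```

With a multiplier of 3 instead of the more common 1.5, only extreme values are removed. In this sample, the negative equity ratio of -30.0 stays in the data, while the implausible 900.0 is dropped.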
| Date of Award | 2025 |
|---|---|
| Original language | German (Austria) |
| Supervisor | Stefan Fink (Supervisor) |
- Controlling, Accounting and Financial Management
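Predictive Analytics für KMU-Insolvenzen in Österreich: ein Methodenvergleich von klassischen und modernen Machine-Learning Ansätzen unter Berücksichtigung von finanziellen, nicht-finanziellen und makroökonomischen Variablen (English: Predictive Analytics for SME Insolvencies in Austria: A Method Comparison of Classical and Modern Machine Learning Approaches Considering Financial, Non-Financial and Macroeconomic Variables)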
Wallner, S. T. (Author). 2025
Student thesis: Master's Thesis