
5. Forecasting airport passenger volumes during the pandemic

5.1 Accuracy of the models

One of the aims of this research was to evaluate the predictive power of the selected forecasting methods during the pandemic. The models were first run with daily and monthly datasets covering the period prior to the COVID-19 crisis to establish their predictive power during normal times. Since the pandemic started to show its first signs at the end of 2019, the selected forecasting periods were the first three, six, and twelve months of 2019. The training period was then extended until September 2020, and forecasts were produced for the last three months of 2020. The monthly forecast results are shown in Table 8, where the most accurate method (lowest MAPE) for each forecasting horizon is marked with an asterisk.
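The tables below report MAPE, MAE, and RMSE for each method and forecasting horizon. As a reference for how these figures relate to the forecasts, the sketch below computes all three measures from a vector of realised values and a vector of forecasts; it is NumPy-based and the variable names are illustrative, not taken from the thesis.

```python
import numpy as np

def forecast_errors(actual, forecast):
    """Return MAPE, MAE and RMSE for a forecast against the realised values."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    errors = actual - forecast
    mape = np.mean(np.abs(errors) / np.abs(actual))  # reported as a fraction, e.g. .0446
    mae = np.mean(np.abs(errors))                    # in passengers
    rmse = np.sqrt(np.mean(errors ** 2))             # in passengers
    return mape, mae, rmse
```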

Table 8. Forecasting accuracies with monthly data

Based on MAPE, all of the methods performed comparably well before the crisis. TBATS seemed to perform the best in the short term, while Prophet performed the worst relative to the others. Surprisingly, however, the longer the forecasting period, the lower the prediction error. SARIMA (ARIMA (0,1,1)(1,1,0)[12]) outperformed the other methods in predictions both six months and one year ahead (see Figure 24). The lower prediction errors during the summer months may explain the lower overall prediction error of the six- and twelve-month forecasts for all methods except TBATS, which performed better at the beginning and end of the year (see appendices 2 and 3 for detailed outputs).

Training set 01/2010 - 12/2018 (forecasts for 2019)

                 ARIMA     PROPHET    TBATS     MLP       ELM
3 months
  MAPE           .0446     .0753      .0332 *   .0479     .0700
  MAE            71 370    120 453    52 976    76 478    111 699
  RMSE           71 642    122 253    54 076    77 620    113 633
6 months
  MAPE           .0242 *   .0401      .0430     .0296     .0452
  MAE            39 391    64 901     77 650    49 288    75 942
  RMSE           50 966    86 763     82 571    58 250    86 072
12 months
  MAPE           .0294 *   .0437      .0459     .0306     .0424
  MAE            49 346    73 635     86 217    52 645    73 340
  RMSE           66 453    94 736     95 079    64 279    86 303

Training set 01/2010 - 9/2020 (forecasts for 10-12/2020)
3 months
  MAPE           N/A       2.8563     N/A       N/A       .0998 *
  MAE            N/A       394 709    N/A       N/A       12 809
  RMSE           N/A       422 911    N/A       N/A       15 007

* lowest MAPE for the forecasting horizon

Figure 24. Forecasting errors (MAPE) before COVID-19 in 2019 (monthly data)
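For illustration, the monthly SARIMA specification reported above, ARIMA(0,1,1)(1,1,0)[12], could be fitted and used for the 3-, 6-, and 12-month forecasts roughly as follows. This is only a sketch using statsmodels with a synthetic stand-in series; the thesis does not show the actual implementation or the software used.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Illustrative stand-in for the monthly passenger series (01/2010-12/2018);
# in practice the airport's actual monthly passenger counts would be used.
index = pd.date_range("2010-01", "2018-12", freq="MS")
rng = np.random.default_rng(0)
seasonal = 1 + 0.3 * np.sin(2 * np.pi * (index.month - 1) / 12)
trend = np.linspace(1.0e6, 1.6e6, len(index))
monthly_passengers = pd.Series(trend * seasonal + rng.normal(0, 2e4, len(index)), index=index)

# ARIMA(0,1,1)(1,1,0)[12], the specification reported for the monthly data
fit = SARIMAX(monthly_passengers, order=(0, 1, 1),
              seasonal_order=(1, 1, 0, 12)).fit(disp=False)

# Forecasts for the first 3, 6 and 12 months of 2019
for horizon in (3, 6, 12):
    print(horizon, fit.forecast(steps=horizon).round(0).tolist())
```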

Unsurprisingly, the more traditional methods, SARIMA and TBATS, were unable to produce usable forecasts when the more complex data was introduced and forecasts were produced for October through December 2020, months in the midst of the COVID-19 crisis. Both methods returned negative values. Since Prophet also initially returned negative values, its trend flexibility was tuned from 0.05 to 0.5, following an example in Prophet's documentation (Prophet n.d.).

This resulted in Prophet producing overly optimistic forecasts and thus a MAPE of 2.86 (286%). It should be noted, however, that no other settings were modified, so the full potential of Prophet was not explored – the method allows the user to manually adjust and add the points where trend changes occur, giving more flexibility to account for significant events. While MLP produced forecasts with negative values, ELM returned surprisingly accurate results: a three-month forecast with a MAPE of 10%.
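For reference, the trend-flexibility tuning described above corresponds to Prophet's changepoint_prior_scale argument (default 0.05), and the manually placed trend changepoints mentioned above would be supplied through the changepoints argument. The sketch below illustrates both on a synthetic stand-in series; it is not the exact configuration used in this work, and the example changepoint date is only illustrative.

```python
import numpy as np
import pandas as pd
from prophet import Prophet

# Illustrative stand-in for the monthly passenger history in Prophet's
# expected format: a 'ds' date column and a 'y' value column.
dates = pd.date_range("2010-01-01", "2020-09-01", freq="MS")
rng = np.random.default_rng(1)
y = np.linspace(1.0e6, 1.6e6, len(dates)) * (1 + 0.3 * np.sin(2 * np.pi * (dates.month - 1) / 12))
y[-7:] *= 0.1  # crude imitation of the collapse from 03/2020 onwards
df = pd.DataFrame({"ds": dates, "y": y + rng.normal(0, 2e4, len(dates))})

# Looser trend flexibility, as tuned above (Prophet's default is 0.05).
m = Prophet(changepoint_prior_scale=0.5)
# Alternatively, trend changepoints can be supplied manually, e.g.
# m = Prophet(changepoints=["2020-03-01"])
m.fit(df)

future = m.make_future_dataframe(periods=3, freq="MS")
forecast = m.predict(future)
print(forecast[["ds", "yhat"]].tail(3))  # forecasts for 10-12/2020
```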

Figure 25 shows the forecasts in both scenarios, before and during COVID-19. The results indicate that the traditional methods are powerful enough to produce high-quality forecasts during regular, stable times, but that they heavily underperform once the data starts containing complexities. While an MLP with five hidden nodes in one layer seemed too simple to forecast the complex data, increasing the number of nodes improved forecasting accuracy. ELM provided promising results through its comparably high predictive power during the crisis. It should be noted that the results should be treated with caution, since the models were not validated with data other than that described above; with different training or testing sets the results might differ.

Figure 25. Forecasts before and during COVID-19
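To make the role of the hidden-node count concrete, the sketch below shows one common way of building an autoregressive MLP for a univariate series, using scikit-learn's MLPRegressor. This illustrates the general idea only; the thesis's own MLP implementation, lag structure, and preprocessing are not shown here and may differ.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def fit_autoregressive_mlp(series, n_lags=12, hidden_nodes=5, random_state=0):
    """Fit an MLP that predicts the next value from the previous n_lags values.

    A rough illustration of how the number of hidden nodes can be varied;
    in practice the series would typically be scaled or differenced first.
    """
    series = np.asarray(series, dtype=float)
    X = np.array([series[i:i + n_lags] for i in range(len(series) - n_lags)])
    y = series[n_lags:]
    model = MLPRegressor(hidden_layer_sizes=(hidden_nodes,), max_iter=5000,
                         random_state=random_state)
    model.fit(X, y)
    return model

def forecast_mlp(model, series, n_lags=12, steps=3):
    """Forecast iteratively, feeding each prediction back in as a new lag."""
    history = list(np.asarray(series, dtype=float)[-n_lags:])
    predictions = []
    for _ in range(steps):
        next_value = model.predict(np.array(history[-n_lags:]).reshape(1, -1))[0]
        predictions.append(next_value)
        history.append(next_value)
    return predictions
```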

In the daily forecasts, the accuracies diverged more than with the monthly values. Figure 26 illustrates how Prophet outperformed the other methods prior to COVID-19 with the lowest MAPE of 5.3–6.4 percent. TBATS, which is able to handle multiple seasonalities, performed nearly as well as Prophet. As expected, SARIMA (ARIMA (5,1,2)(0,1,0)[365]) did not stand out in the results, perhaps because the data contains multiple seasonalities. Surprisingly, however, the more sophisticated neural networks performed even worse than the SARIMA model. The forecasting error of MLP increased significantly from 18.7 to 30.0 percent when the forecasting period was extended from 90 to 365 days.

Figure 26. Forecasting errors (MAPE) prior to COVID-19 crisis in 2019 (daily data)
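The multiple seasonalities mentioned above can be declared explicitly in a TBATS model, for example weekly and yearly patterns in daily data. The sketch below uses the Python tbats package with a synthetic stand-in series; the implementation actually used in the thesis is not specified here, and the seasonal periods are assumptions typical for daily passenger data.

```python
import numpy as np
from tbats import TBATS

# Illustrative stand-in for the daily passenger series (in practice the
# airport's daily passenger counts from 01/2016 onwards would be used).
rng = np.random.default_rng(2)
t = np.arange(3 * 365)
daily_passengers = (50_000
                    + 8_000 * np.sin(2 * np.pi * t / 7)        # weekly pattern
                    + 15_000 * np.sin(2 * np.pi * t / 365.25)  # yearly pattern
                    + rng.normal(0, 2_000, len(t)))

# TBATS with explicit weekly and yearly seasonal periods for daily data
estimator = TBATS(seasonal_periods=[7, 365.25])
model = estimator.fit(daily_passengers)
forecast = model.forecast(steps=90)  # e.g. a 90-day horizon
print(forecast[:7])
```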

When the training period was extended until the end of September and forecasts were produced for the last three months of 2020, only the traditional methods could produce non-negative forecasts. However, the performance of ARIMA (5,1,1) was poor (MAPE 34.4%), and so was that of TBATS (MAPE 28.3%). The results for Prophet, MLP, and ELM were recorded as "not available" since they all produced forecasts with negative values. Unlike the ARIMA and TBATS models, which were given nearly five years of data from January 2016 until September 2020, the MLP and ELM models were trained with data from January 2018 until the end of September 2020, with 20 repetitions each. This was done due to limited computing power.

Table 9. Forecasting accuracies with daily data during COVID-19

Training set 01/2016 - 12/2018 (forecasts for 2019)

                 ARIMA     PROPHET    TBATS     MLP       ELM
90 days
  MAPE           .1266     .0636      .0839     .1873     .1714
  MAE            6 502     3 302      4 364     9 500     8 704
  RMSE           7 631     3 926      5 212     11 557    10 850
180 days
  MAPE           .1067     .0553      .0755     .2348     .1550
  MAE            5 977     3 137      4 342     13 560    8 584
  RMSE           6 991     3 846      5 139     15 668    10 537
365 days
  MAPE           .1105     .0532      .0766     .2995     .1471
  MAE            6 197     2 929      4 401     17 350    8 195
  RMSE           7 397     4 073      5 402     19 322    10 374

Training set 01/2016 - 09/2020 (forecasts for 10-12/2020, 90 days)
  MAPE           .3439     N/A        .2830     N/A       N/A
  MAE            1 222     N/A        1 049     N/A       N/A
  RMSE           1 493     N/A        1 286     N/A       N/A

Training set 04/2020 - 09/2020 (forecasts for 10-12/2020, 90 days)
  MAPE           .3580     .3140      .2959     .6524     .3637
  MAE            1 275     1 480      1 079     2 422     1 295
  RMSE           1 543     2 035      1 314     2 689     1 566

Table 9 combines the results for the models described above and for models trained only with data from during the crisis. During the crisis (training set 04/2020 - 09/2020), daily forecasting errors increased significantly for all methods compared to pre-COVID performance, resulting in high MAPEs of 29.6–65.2 percent. However, no negative forecast values were produced when the historical development and the fall in passenger numbers were eliminated from the training data. TBATS was the best performer with a MAPE of 29.6 percent, followed by Prophet (MAPE 31.4 percent). The results suggest that the neural networks did not perform any better than the traditional methods, even when the data contained more complex patterns than multiple seasonalities and linear trends. However, the results should be treated with caution since the training period during the crisis was rather short (185 observations).

5.2 Effects of the pandemic-related variables on forecasting