Methodology - Forecasting Emergency Department arrivals with Facebook Prophet library

This chapter describes the Facebook Prophet library and the evaluation metrics used to assess how the model performs. Although the library has an intuitive application programming interface (API) for Python that follows the Scikit-learn model API [18], it is necessary to look more closely at how Prophet works under the hood.

3.1 Prophet

Facebook’s Core Data Science team released Prophet, open-source time series forecasting software with an API available for R and Python [19]. A simple prediction can be made with Prophet with fairly little effort; however, optimizing the model requires tuning its hyperparameters. The next subsection covers the structure of the Prophet model thoroughly.

3.1.1 Structure

As mentioned in Chapter 2, a time series forecasting model can be constructed as stated in equation 2.1. In the original paper, Taylor & Letham [20] proposed a model for Prophet which can be represented in the following equation:

$$ y(t) = g(t) + s(t) + h(t) + \epsilon_t, \tag{3.1} $$

where $g(t)$ is the trend function, which models non-periodic changes in the time series, $s(t)$ describes periodic changes, e.g. monthly and yearly seasonality, $h(t)$ describes the effect of holidays and events, which may occur on irregular schedules, and finally $\epsilon_t$ is the error term, which represents any idiosyncratic changes that are not accommodated by the model.

Two different types of trend functions are proposed for Prophet: a nonlinear trend with saturating growth and a linear trend with changepoints [20]. The first one can be formulated as:

$$ g(t) = \frac{C}{1 + \exp(-k(t - m))}, \tag{3.2} $$

where $C$ is the carrying capacity, $k$ is the growth rate, and $m$ is an offset parameter. Typically $C$ and $m$ are not constant, which is not taken into account in equation 3.2. Taylor & Letham [20] proposed a model in which this is considered, but using that model requires additional information about the parameters, which was not available for this thesis. Therefore this thesis will focus on the linear trend, which is formulated as follows:

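$$ g(t) = (k + a(t)^T \delta)\,t + (m + a(t)^T \gamma), \tag{3.3} $$

where $k$ is the growth rate, $\delta$ is the vector of rate adjustments, $m$ is again the offset parameter, and $\gamma_j$ is set to $-s_j \delta_j$ in order to make the function continuous. The $S$ changepoints occur at times $s_j$, $j = 1, \dots, S$, and $a(t)$ is an indicator vector whose $j$-th element is 1 if $t \geq s_j$ and 0 otherwise [20].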
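The linear trend with changepoints can be illustrated with a minimal sketch in plain Python. This is not Prophet's own implementation. The function name and the example values are hypothetical, and the indicator vector a(t) is assumed to follow the definition in Taylor & Letham [20], where element j is 1 once t has reached changepoint s_j.

```python
def piecewise_linear_trend(t, k, m, changepoints, deltas):
    """Evaluate g(t) = (k + a(t)^T delta) * t + (m + a(t)^T gamma).

    Hypothetical helper for illustration only, not part of Prophet's API.
    """
    # a_j(t) = 1 once t has reached changepoint s_j, otherwise 0 (assumed from [20])
    a = [1.0 if t >= s else 0.0 for s in changepoints]
    # gamma_j = -s_j * delta_j keeps the trend continuous at each changepoint
    gammas = [-s * d for s, d in zip(changepoints, deltas)]
    rate = k + sum(ai * di for ai, di in zip(a, deltas))
    offset = m + sum(ai * gi for ai, gi in zip(a, gammas))
    return rate * t + offset


# Illustrative values: base growth rate 1.0 that drops by 0.5 at t = 10
before = piecewise_linear_trend(5.0, 1.0, 0.0, [10.0], [-0.5])
at_changepoint = piecewise_linear_trend(10.0, 1.0, 0.0, [10.0], [-0.5])
after = piecewise_linear_trend(20.0, 1.0, 0.0, [10.0], [-0.5])
```

With these values the slope changes from 1.0 to 0.5 at the changepoint, while the offset adjustment prevents a jump in the trend at t = 10.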

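The seasonality model is constructed with a standard Fourier series, and thus it can model arbitrary smooth seasonal effects. Hence seasonality can be modeled as:

$$ s(t) = \sum_{n=1}^{N} \left( a_n \cos\left(\frac{2\pi n t}{P}\right) + b_n \sin\left(\frac{2\pi n t}{P}\right) \right). \tag{3.4} $$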

Here $P$ is the regular period the time series is expected to have, e.g. $P = 365.25$ for a year and $P = 7$ for a week. The parameters $a_n$ and $b_n$ need to be estimated in order to fit the seasonality; they are collected in the vector $\beta = [a_1, b_1, \dots, a_N, b_N]^T$. This can be implemented by constructing a matrix of seasonality vectors for every value of $t$ in the past and future data. The seasonal component is then $s(t) = X(t)\beta$, where $X(t)$ can be formulated, for example, as:

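$$ X(t) = \left[ \cos\left(\frac{2\pi (1) t}{7}\right), \dots, \sin\left(\frac{2\pi (8) t}{7}\right) \right] \tag{3.5} $$

with weekly seasonality and $N = 8$. For approximating $\beta$ in a general model, Taylor & Letham [20] assumed that $\beta \sim N(0, \sigma^2)$, which means that $\beta$ is normally distributed and centered around zero.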
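The construction of the seasonal component from Fourier terms can be sketched as follows. This is a minimal illustration in plain Python rather than Prophet's own code. The function names are hypothetical, and the coefficients in beta would normally be estimated during fitting, so the values used here are arbitrary.

```python
import math


def fourier_features(t, period, order):
    """Return X(t) = [cos(2*pi*1*t/P), sin(2*pi*1*t/P), ..., cos(2*pi*N*t/P), sin(2*pi*N*t/P)]."""
    features = []
    for n in range(1, order + 1):
        angle = 2.0 * math.pi * n * t / period
        features.append(math.cos(angle))  # multiplied by a_n
        features.append(math.sin(angle))  # multiplied by b_n
    return features


def seasonality(t, period, beta):
    """Return s(t) = X(t) beta, with beta ordered as [a_1, b_1, ..., a_N, b_N]."""
    order = len(beta) // 2
    x = fourier_features(t, period, order)
    return sum(xi * bi for xi, bi in zip(x, beta))


# Weekly seasonality with N = 8 gives 2 * 8 = 16 features per time point
weekly_features = fourier_features(3.0, 7.0, 8)

# Arbitrary illustrative coefficients; in practice these are estimated from data
beta = [0.1 * (i + 1) for i in range(16)]
monday = seasonality(1.0, 7.0, beta)
next_monday = seasonality(8.0, 7.0, beta)
```

Because every term has a period that divides P, the seasonal effect repeats exactly after P time units, so `monday` and `next_monday` agree up to floating-point error.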

Holidays are tricky to forecast because they rarely follow a periodic pattern. However, national holidays are known to occur on certain days, so they must also be considered.

The Prophet model allows the user to manually add columns for the desired holiday or event. Assuming holidays are independent, combining the model with the holidays is quite direct. For every holiday $i$, $D_i$ is the list of dates for the holiday in the past and in the future. Next, an indicator function is used to denote whether time $t$ falls during a holiday. After that, each holiday is assigned a parameter $\kappa_i$, which stands for the corresponding change in the forecast.

This is done in a similar manner as for seasonality:

$$ Z(t) = [\mathbf{1}(t \in D_1), \dots, \mathbf{1}(t \in D_L)] \tag{3.6} $$

and eventually taking

$$ h(t) = Z(t)\kappa. \tag{3.7} $$

The same assumption is made for $\kappa$ being normally distributed, so $\kappa \sim N(0, \nu^2)$ [20].
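The holiday component can be sketched in the same way as the seasonal one. The following is a minimal illustration in plain Python, not Prophet's own code. The function names, the holiday dates, and the effect sizes in kappa are hypothetical and chosen only to show how the indicator vector Z(t) selects the relevant effects.

```python
from datetime import date


def holiday_features(day, holidays):
    """Return Z(t): one indicator per holiday, 1.0 if `day` is in that holiday's date set D_i."""
    return [1.0 if day in dates else 0.0 for dates in holidays.values()]


def holiday_effect(day, holidays, kappa):
    """Return h(t) = Z(t) kappa, with kappa in the same order as the holidays."""
    z = holiday_features(day, holidays)
    return sum(zi * ki for zi, ki in zip(z, kappa))


# Hypothetical holiday calendar: each D_i lists past and future dates of holiday i
holidays = {
    "new_year": {date(2019, 1, 1), date(2020, 1, 1)},
    "midsummer": {date(2019, 6, 22), date(2020, 6, 20)},
}
# Hypothetical effects; in Prophet these would be estimated with a normal prior
kappa = [12.0, -5.0]

new_year_effect = holiday_effect(date(2020, 1, 1), holidays, kappa)
ordinary_day_effect = holiday_effect(date(2019, 3, 3), holidays, kappa)
```

On a day that belongs to no holiday, every indicator is zero, so the holiday component contributes nothing to the forecast.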

In terms of implementation, Prophet uses Stan as its backend. Stan is a platform for statistical modeling that offers high performance for statistical computation [21].

Prophet’s source code is available on GitHub [22].

3.2 Evaluation

When implementing machine learning models, it is important to receive feedback on how the model performed and ultimately to compare the results with other implementations. Error functions (loss functions) are a handy tool for this, and there are a few common loss functions used in machine learning, such as mean squared error (MSE) and mean absolute error (MAE) [23]. Root mean squared error (RMSE), MAE and mean absolute percentage error (MAPE) will be used in this thesis to evaluate Prophet’s performance.

MAPE is a common metric used to measure the outcome of time series forecasting models [24]. MAPE is also often used in real-world applications if the quantity to predict is guaranteed to stay above zero, as in this thesis’ data [25]. The loss function itself is calculated in the following way:

$$ \mathrm{MAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right|, \tag{3.8} $$

where $y_t$ is the actual value, $\hat{y}_t$ is the predicted value and $n$ is the total number of samples [23]. While MAPE represents the percentage error, MAE expresses the absolute value of the prediction error. MAE can be formulated as:

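$$ \mathrm{MAE} = \frac{1}{n} \sum_{t=1}^{n} \left| y_t - \hat{y}_t \right|. \tag{3.9} $$

RMSE is the same as MSE but square rooted. MSE resembles MAE, the only difference being that the error term is squared. This causes larger errors to have more weight. RMSE is defined in the following manner [26]:

$$ \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{t=1}^{n} \left( y_t - \hat{y}_t \right)^2}. \tag{3.10} $$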

The parameters of RMSE represent the same variables as in MAPE and MAE. For all three metrics, a lower value indicates a better result, which means they are negatively oriented.
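The three metrics can be computed with a few lines of plain Python. The sketch below uses the standard textbook definitions, with MAPE scaled by 100 to give a percentage, which is an assumption about the exact form used here. The function names and the sample values are hypothetical and are not results from this thesis.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error in percent; assumes no actual value is zero."""
    n = len(actual)
    return 100.0 / n * sum(abs((y - y_hat) / y) for y, y_hat in zip(actual, predicted))


def mae(actual, predicted):
    """Mean absolute error, in the same unit as the data."""
    n = len(actual)
    return sum(abs(y - y_hat) for y, y_hat in zip(actual, predicted)) / n


def rmse(actual, predicted):
    """Root mean squared error; squaring gives larger errors more weight."""
    n = len(actual)
    return math.sqrt(sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)) / n)


# Made-up daily arrival counts and forecasts, for illustration only
actual = [100.0, 200.0]
predicted = [110.0, 180.0]

mape_value = mape(actual, predicted)
mae_value = mae(actual, predicted)
rmse_value = rmse(actual, predicted)
```

In this example the RMSE is larger than the MAE because the error of 20 on the second day is weighted more heavily than the error of 10 on the first day.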
