Exponential smoothing is a rule-of-thumb technique for smoothing time series data using the exponential window function. It assigns exponentially decreasing weights over time: older data receives smaller weights (less importance) than more recent data.
Let \(x_t\) be the series of data and \(s_t\) our series of smoothed estimates, with \(s_0=x_0\). Then:
\[\begin{cases} s_0 = x_0\\ s_t = \alpha x_t + (1 - \alpha) s_{t-1} \; \text{, } t \gt 0 \end{cases}\]Where:
\(\alpha\) is the smoothing factor, with \(0 \lt \alpha \lt 1\).
A greater \(\alpha\) gives more weight to recent observations, while a smaller \(\alpha\) gives more weight to older observations.
Here the smoothed value is computed at the time the true observation becomes known; it is not forward-looking.
To make future predictions the formula is:
\[s_{t+h} = \alpha \sum_{i=0}^t (1 - \alpha)^{i} x_{t-i}\]It can be rewritten:
\[s_{t+h}=\alpha x_{t} + (1 - \alpha) s_{t-1} = s_t\]The future prediction does not depend on the horizon \(h\): the prediction is constant in the future.
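As an illustration, here is a minimal Python sketch of the recursion above; the function name and the toy data are only for the example.

```python
def simple_exponential_smoothing(x, alpha):
    """Return the smoothed series s with s_0 = x_0."""
    s = [x[0]]
    for t in range(1, len(x)):
        s.append(alpha * x[t] + (1 - alpha) * s[-1])
    return s

# Toy data, only to make the snippet runnable.
x = [3.0, 10.0, 12.0, 13.0, 12.0, 10.0, 12.0]
s = simple_exponential_smoothing(x, alpha=0.5)
# The forecast is flat: for any horizon h, the prediction is the last smoothed value.
forecast = s[-1]
print(s, forecast)
```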
The optimal value for \(\alpha\) is estimated using the least squares method:
\[\hat{\alpha} = \arg\min_{\alpha} \sum_{t=1}^T \left(x_t - s_{t-1}\right)^2\]Where:
\(T\) is the number of observations and \(s_{t-1}\) is the one-step-ahead prediction of \(x_t\) (minimizing against \(s_t\) itself would be trivial, since \(s_t\) already contains \(x_t\) and the error vanishes as \(\alpha \to 1\)).
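One possible way to calibrate \(\alpha\) is a bounded scalar optimization of this sum of squared errors, sketched below with scipy.optimize; the helper name and the toy data are my own.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def one_step_sse(alpha, x):
    """Sum of squared one-step-ahead errors: s_{t-1} forecasts x_t."""
    s = x[0]
    total = 0.0
    for t in range(1, len(x)):
        total += (x[t] - s) ** 2             # error of the forecast made at t-1
        s = alpha * x[t] + (1 - alpha) * s   # update the smoothed value
    return total

x = np.array([3.0, 10.0, 12.0, 13.0, 12.0, 10.0, 12.0])
result = minimize_scalar(lambda a: one_step_sse(a, x), bounds=(0.01, 0.99), method="bounded")
alpha_opt = result.x
print(alpha_opt)
```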
Simple exponential smoothing is very easy to calibrate and use but does not take into account the trend of the observations.
Double exponential smoothing takes this into account. Two types of double exponential smoothing exist: Holt-Winters and Brown's.
Let again \(x_t\) be the series of observations, starting at \(t=0\). \(s_t\) is the smoothed value at time \(t\) and \(b_t\) is the trend estimate at time \(t\). The prediction at horizon \(h\) is now written \(\bar{x}_{t+h}\).
Then, for \(t \gt 0\):
\[\begin{cases} s_t = \alpha x_t + (1 - \alpha)(s_{t-1} + b_{t-1}) \\ b_t = \beta (s_t - s_{t-1}) + (1 - \beta) b_{t-1} \end{cases}\]Where:
\(\alpha\) is the data smoothing factor and \(\beta\) is the trend smoothing factor, both in \((0, 1)\).
The prediction is:
\[\bar{x}_{t+h} = s_t + h \cdot b_t\]The prediction at time \(t+h\) is the last known smoothed value \(s_t\) plus the trend \(b_t\) multiplied by the horizon \(h\).
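A minimal sketch of this recursion and its forecast; the initialization \(b_0 = x_1 - x_0\) is a common choice but is not specified in the text.

```python
def double_exponential_smoothing(x, alpha, beta, h):
    """Holt's double exponential smoothing; returns the h-step-ahead forecast s_T + h * b_T."""
    s, b = x[0], x[1] - x[0]          # b_0 = x_1 - x_0 is a common initialization
    for t in range(1, len(x)):
        s_prev = s
        s = alpha * x[t] + (1 - alpha) * (s_prev + b)
        b = beta * (s - s_prev) + (1 - beta) * b
    return s + h * b

x = [3.0, 10.0, 12.0, 13.0, 12.0, 10.0, 12.0]
print(double_exponential_smoothing(x, alpha=0.5, beta=0.3, h=2))
```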
The optimal values for \(\alpha\) and \(\beta\) are again estimated using the least squares method:
\[\{\hat{\alpha}, \hat{\beta}\} = \arg\min_{\{\alpha, \beta\}} \sum_{t=1}^T \left(x_t - s_{t-1} - b_{t-1}\right)^2\]Where:
\(s_{t-1} + b_{t-1}\) is the one-step-ahead prediction of \(x_t\) and \(T\) is the number of observations.
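The joint calibration can be sketched the same way, here with a bounded L-BFGS-B optimization from scipy; again, the helper name, the initialization and the toy data are assumptions of the example.

```python
import numpy as np
from scipy.optimize import minimize

def holt_one_step_sse(params, x):
    """One-step-ahead squared errors for double exponential smoothing."""
    alpha, beta = params
    s, b = x[0], x[1] - x[0]
    total = 0.0
    for t in range(1, len(x)):
        total += (x[t] - (s + b)) ** 2        # s + b forecasts x_t from time t-1
        s_prev = s
        s = alpha * x[t] + (1 - alpha) * (s_prev + b)
        b = beta * (s - s_prev) + (1 - beta) * b
    return total

x = np.array([3.0, 10.0, 12.0, 13.0, 12.0, 10.0, 12.0])
res = minimize(holt_one_step_sse, x0=[0.5, 0.5], args=(x,),
               bounds=[(0.01, 0.99), (0.01, 0.99)], method="L-BFGS-B")
alpha_opt, beta_opt = res.x
```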
Let us now suppose a seasonality of length \(L\). We want to estimate a sequence \(c_t\) of seasonal correction factors. We need at least \(2L\) observations (two complete seasons) of historical data to initialize the method.
For the additive seasonality, the correction factors are terms that we add to the prediction.
\(\begin{cases} s_0 = x_0 \\ b_0 = \frac{1}{L}\sum_{j=1}^L \frac{x_{L+j} - x_j}{L} \\ c_i = \frac{1}{N}\sum_{j=1}^N \left(x_{L (j-1) + i} - A_j\right) \; \text{ for } i=1, 2, \dots, L \end{cases}\)
Where:
\(b_0\) is the initial trend, computed as the average of the per-step trends observed between two consecutive seasons (i.e. between a point and the same point in the next season). \(c_i\) is the correction for the \(i\)-th point of the season: the average, over the \(N\) complete seasons available in the history, of the deviation of that point from its season average \(A_j = \frac{1}{L}\sum_{i=1}^{L} x_{L(j-1)+i}\). In the additive framework the correction factors are added.
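A possible implementation of this initialization, using 0-based indexing; the function name and the toy series are mine.

```python
def holt_winters_init_additive(x, L, N):
    """Initial level, trend and additive seasonal corrections from N complete seasons.

    Indexing is 0-based, so season j (j = 1..N) is x[L*(j-1) : L*j].
    """
    # A[j-1]: average of the j-th season.
    A = [sum(x[L * (j - 1): L * j]) / L for j in range(1, N + 1)]
    s0 = x[0]
    # Average per-step trend between the first two seasons.
    b0 = sum((x[L + i] - x[i]) / L for i in range(L)) / L
    # c[i]: average deviation of the i-th point of each season from that season's mean.
    c = [sum(x[L * (j - 1) + i] - A[j - 1] for j in range(1, N + 1)) / N for i in range(L)]
    return s0, b0, c

# Toy series with a seasonality of length 4 and N = 2 seasons of history.
x = [10, 14, 8, 25, 12, 17, 10, 28]
s0, b0, c = holt_winters_init_additive(x, L=4, N=2)
print(s0, b0, c)
```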
Then for \(t \gt N \times L\):
\[\begin{cases} s_t = \alpha (x_t - c_{t-L}) + (1 - \alpha)(s_{t-1} + b_{t-1})\\ b_t = \beta (s_t - s_{t-1}) + (1 - \beta) b_{t-1} \\ c_t = \gamma(x_t - s_{t-1} - b_{t-1}) + (1 - \gamma)c_{t-L} \end{cases}\]The prediction is:
\[\bar{x}_{t+h} = s_t + h \cdot b_t + c_{(t-L+1)+(h-1) \mod L}\]The seasonal index \((t-L+1)+(h-1) \mod L\) picks, among the last \(L\) estimated corrections, the one matching the season position of \(t+h\).
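A sketch of the additive Holt-Winters recursion and forecast. For simplicity the recursion is run over the whole series starting from given initial values (which would normally come from an initialization like the one above), and the seasonal corrections are kept in a circular buffer indexed by \(t \bmod L\).

```python
def holt_winters_additive(x, L, alpha, beta, gamma, s0, b0, c0, h):
    """Additive Holt-Winters: run the recursion over x and return the h-step-ahead forecast."""
    s, b = s0, b0
    c = list(c0)                 # c[p] holds the latest correction for season position p
    for t in range(len(x)):
        s_prev, b_prev = s, b
        s = alpha * (x[t] - c[t % L]) + (1 - alpha) * (s_prev + b_prev)
        b = beta * (s - s_prev) + (1 - beta) * b_prev
        c[t % L] = gamma * (x[t] - s_prev - b_prev) + (1 - gamma) * c[t % L]
    # Forecast: last level plus h times the trend, plus the correction for position (t+h) mod L.
    return s + h * b + c[(len(x) - 1 + h) % L]

# Toy example; the initial values correspond to the initialization of the first two seasons.
x = [10, 14, 8, 25, 12, 17, 10, 28, 14, 19, 12, 31]
print(holt_winters_additive(x, L=4, alpha=0.5, beta=0.3, gamma=0.2,
                            s0=10, b0=0.625, c0=[-4.5, 0.0, -6.5, 11.0], h=2))
```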
The same setup applies here: a seasonality of length \(L\), a sequence \(c_t\) of seasonal correction factors to estimate, and at least \(2L\) observations of historical data for the initialization. For the multiplicative seasonality, the correction factors are factors by which we multiply the prediction.
\(\begin{cases} s_0 = x_0 \\ b_0 = \frac{1}{L}\sum_{j=1}^L \frac{x_{L+j} - x_j}{L} \\ c_i = \frac{1}{N}\sum_{j=1}^N \frac{x_{L (j-1) + i}}{A_j} \; \text{ for } i=1, 2, \dots, L \end{cases}\)
Where:
\(b_0\) is the initial trend, computed as the average of the per-step trends observed between two consecutive seasons (i.e. between a point and the same point in the next season).
\(c_i\) is the correction for the \(i\)-th point of the season: the average, over the \(N\) complete seasons, of the ratio of that point to its season average \(A_j\). In the multiplicative framework the correction factors are multiplied.
Then for \(t \gt N \times L\):
\[\begin{cases} s_t = \alpha \frac{x_t}{c_{t-L}} + (1 - \alpha)(s_{t-1} + b_{t-1})\\ b_t = \beta (s_t - s_{t-1}) + (1 - \beta) b_{t-1} \\ c_t = \gamma \frac{x_t}{s_t} + (1 - \gamma)c_{t-L} \end{cases}\]The prediction is:
\[\bar{x}_{t+h} = \left(s_t + h \cdot b_t \right) c_{(t-L+1)+(h-1) \mod L}\]
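The multiplicative variant only changes how the seasonal factors enter the updates and the forecast; a corresponding sketch, under the same simplifications as the additive one (initial values supplied, recursion run over the whole series):

```python
def holt_winters_multiplicative(x, L, alpha, beta, gamma, s0, b0, c0, h):
    """Multiplicative Holt-Winters: seasonal factors divide the observations and multiply the forecast."""
    s, b = s0, b0
    c = list(c0)
    for t in range(len(x)):
        s_prev, b_prev = s, b
        s = alpha * x[t] / c[t % L] + (1 - alpha) * (s_prev + b_prev)
        b = beta * (s - s_prev) + (1 - beta) * b_prev
        c[t % L] = gamma * x[t] / s + (1 - gamma) * c[t % L]
    return (s + h * b) * c[(len(x) - 1 + h) % L]

# Same toy series; the initial seasonal factors are now ratios to the season averages.
x = [10, 14, 8, 25, 12, 17, 10, 28, 14, 19, 12, 31]
print(holt_winters_multiplicative(x, L=4, alpha=0.5, beta=0.3, gamma=0.2,
                                  s0=10, b0=0.625, c0=[0.71, 1.0, 0.58, 1.71], h=2))
```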
Brown's double exponential smoothing instead applies simple exponential smoothing twice. Then, for \(t \gt 0\):
\[\begin{cases} s^{'}_t = \alpha x_t + (1 - \alpha) s^{'}_{t-1}\\ s^{''}_t = \alpha s^{'}_t + (1 - \alpha) s^{''}_{t-1} \end{cases}\]Where:
\(s^{'}_t\) is the simple exponential smoothing of the observations \(x_t\) and \(s^{''}_t\) is the simple exponential smoothing of \(s^{'}_t\), both with the same smoothing factor \(\alpha\).
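A minimal sketch of the two smoothed series; initializing both at \(x_0\) is an assumption, not taken from the text.

```python
def brown_double_smoothing(x, alpha):
    """Brown's method: simple exponential smoothing applied twice."""
    s1, s2 = x[0], x[0]                          # s'_0 = s''_0 = x_0 (a common initialization)
    for t in range(1, len(x)):
        s1 = alpha * x[t] + (1 - alpha) * s1     # singly smoothed series
        s2 = alpha * s1 + (1 - alpha) * s2       # doubly smoothed series
    return s1, s2

x = [3.0, 10.0, 12.0, 13.0, 12.0, 10.0, 12.0]
s1, s2 = brown_double_smoothing(x, alpha=0.5)
# A forecast is typically built from the two series (level 2*s1 - s2,
# trend alpha/(1-alpha)*(s1 - s2)), but that step is not covered in the text above.
print(s1, s2)
```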