September 3, 2020
# How Good is my Forecast?

## How it works

## Measuring Error

### 1. ME - Mean Error

### 2. MAE - Mean Absolute Error

### 3. RMSE - Root Mean Squared Error

### 4. MAPE - Mean Absolute Percentage Error

#### I Think You’ll Find It’s a Bit More Complicated Than That

If you have enough data then the Forecast Forge addon will estimate how accurate your forecast is likely to be.

We don’t know what will happen in the future so it is impossible to be certain how good or bad your forecast will be. But we can use the same forecasting algorithm to make a forecast for the recent past and then compare how accurate that forecast is against what actually happened.

For example, you might pretend you don’t know what happened between April 2019 and April 2020 (and I think we’d all like to imagine this didn’t happen at all!) and use the data from April 2017 to March 2019 to feed into the forecasting algorithm.

Then you can compare the results of this forecast with the actual data for 2019/20 to see how good the forecasting algorithm is at predicting with your data.

```
         You have this data
2018       2019        2020
/----------/-----------/-----
                             |~~~~~~
                             And you want to forecast this

      Use this data
2018       2019        2020
/----------/-----------/-----
                       |~~~~~~
                       To forecast this
```
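Here is a minimal sketch of a single backtest in Python. It is not the addon's code: `split_for_backtest` and `naive_forecast` are hypothetical helpers, and the "repeat the last value" forecaster is only a stand-in for whatever forecasting algorithm you actually use. Errors are calculated as forecast minus actual, so a positive error means the forecast overestimated.

```python
def split_for_backtest(values, holdout):
    """Split a series into training data and a held-out tail of `holdout` points."""
    if not 0 < holdout < len(values):
        raise ValueError("holdout must be between 1 and len(values) - 1")
    return values[:-holdout], values[-holdout:]


def naive_forecast(train, horizon):
    """Stand-in forecaster (an assumption): repeat the last observed value."""
    return [train[-1]] * horizon


# Made-up daily values; pretend we don't know the last three days.
daily_values = [100, 102, 98, 105, 110, 108, 112, 115]
train, actual = split_for_backtest(daily_values, holdout=3)

# Forecast the held-out days using only the training data.
forecast = naive_forecast(train, horizon=len(actual))

# Daily errors: forecast minus actual.
errors = [f - a for f, a in zip(forecast, actual)]
print(errors)
```

The `errors` list is the raw material for all four of the error metrics.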

The Forecast Forge addon shows you four different ways of measuring the error. They are each useful in different circumstances.

Every error metric is based on the daily errors: the difference between the forecast value and the actual value (forecast minus actual) for each day in the forecast.

**Mean Error (ME):** take all the error values and find the mean.

This is the simplest error metric but it doesn’t always tell you the full story because positive and negative errors (where the forecast over- and under-estimates) can cancel each other out.

The main thing the Mean Error tells you is whether the forecast tends to overestimate (positive error) or underestimate (negative error).
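As a rough sketch (the function name and the numbers are made up for illustration), Mean Error can be calculated like this. The example is chosen so that the over- and under-estimates cancel out completely.

```python
def mean_error(actual, forecast):
    """Mean of the daily errors, where each error is forecast minus actual."""
    errors = [f - a for f, a in zip(forecast, actual)]
    return sum(errors) / len(errors)


actual = [100, 100, 100, 100]
forecast = [110, 90, 120, 80]  # errors are +10, -10, +20, -20

me = mean_error(actual, forecast)
print(me)  # 0.0, even though every single day was wrong
```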

**Mean Absolute Error (MAE):** take the *absolute value* of the errors (i.e. make them all positive) and then find the mean.

This fixes the problem with Mean Error described above.
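A minimal sketch of the calculation, using an illustrative function name and made-up numbers in which the positive and negative errors would cancel out if you took a plain mean:

```python
def mean_absolute_error(actual, forecast):
    """Mean of the absolute daily errors."""
    abs_errors = [abs(f - a) for f, a in zip(forecast, actual)]
    return sum(abs_errors) / len(abs_errors)


actual = [100, 100, 100, 100]
forecast = [110, 90, 120, 80]  # errors are +10, -10, +20, -20

mae = mean_absolute_error(actual, forecast)
print(mae)  # 15.0, because the signs can no longer cancel
```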

**Root Mean Squared Error (RMSE):** square all the error values, find the mean of these squares and then take the square root.

This is a **very** commonly used error metric in machine learning. I **strongly** suggest you try to minimise this error when working to improve your forecasts unless you have a very good reason not to.

However, this can be a bit harder to understand than the other error metrics, so once you have your model figured out you can report MAE or MAPE to clients who aren’t elbow-deep in forecasting.
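Here is a small sketch of the RMSE calculation; the function names and numbers are invented for illustration. The two forecasts below have the same MAE, but the one with a single large miss gets a higher RMSE because squaring gives big errors more weight.

```python
import math


def rmse(actual, forecast):
    """Square the daily errors, take the mean, then take the square root."""
    squared = [(f - a) ** 2 for f, a in zip(forecast, actual)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, forecast):
    """Mean absolute error, included only for comparison."""
    return sum(abs(f - a) for f, a in zip(forecast, actual)) / len(actual)


actual = [50, 50, 50, 50]
steady_forecast = [53, 53, 53, 53]  # off by 3 every day
spiky_forecast = [50, 50, 50, 62]   # perfect, except one miss of 12

print(mae(actual, steady_forecast), rmse(actual, steady_forecast))  # 3.0 3.0
print(mae(actual, spiky_forecast), rmse(actual, spiky_forecast))    # 3.0 6.0
```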

**Mean Absolute Percentage Error (MAPE):** express each error as a percentage of the actual value, take the absolute value and then calculate the mean.

This is a very useful error metric because it is a percentage; it doesn’t matter what scale the values being forecast are on.

For example, imagine I tell you that I’ve made a forecast for average order value (AOV) and that my MAE is `15`. Is this good or bad?

It is impossible to say without knowing more about the average order value. If it is very high (e.g. over `$200`) then `15` is quite good. If it is very low (e.g. `$20`) then `15` is very bad!

But if you have a MAPE of `10%` then you don’t need to know how big or small the AOV is to assess how much of a problem the error might be.
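To make the AOV example concrete, here is a sketch with made-up numbers; the function names are illustrative and the percentage is taken relative to the actual value. Both forecasts have an MAE of 15, but the MAPE shows how different they really are. Note that MAPE cannot be calculated when an actual value is zero, so the sketch raises an error in that case.

```python
def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(f - a) for f, a in zip(forecast, actual)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error, as a percentage of the actual values."""
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    pct_errors = [abs(f - a) / abs(a) * 100 for f, a in zip(forecast, actual)]
    return sum(pct_errors) / len(pct_errors)


# A store with a high AOV and a store with a low AOV (invented data).
high_actual, high_forecast = [200, 250, 300], [215, 235, 315]
low_actual, low_forecast = [20, 25, 30], [35, 10, 45]

print(mae(high_actual, high_forecast), mae(low_actual, low_forecast))  # both 15.0
print(round(mape(high_actual, high_forecast), 1))  # roughly 6%
print(round(mape(low_actual, low_forecast), 1))    # roughly 62%
```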

For more detail on running backtests manually or using other error metrics, read the *Backtesting Forecasts to Estimate Future Accuracy* post.

As with just about everything, it’s a bit more complicated than that!

Rather than run just one backtest, the addon runs up to five and then averages the results. This is just in case one of the backtest periods is exceptional in some way; running more than one test makes the estimates more accurate.

```
2018       2019        2020
/----------/-----------/-----
                       |~~~~~~ Backtest 1
                 |~~~~~~       Backtest 2
           |~~~~~~             Backtest 3
```

This is a process known as *Timeseries Cross Validation*.

Each backtest is known as a *fold*. You can see the number of folds that were used below the error metrics table.
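The idea can be sketched in a few lines of Python. This is an illustration under my own assumptions, not the addon's implementation: the post doesn't say how the addon positions its folds, so here each fold simply steps back by one forecast horizon, and a "repeat the last value" forecaster stands in for the real algorithm. All function names are hypothetical.

```python
def backtest_folds(n_points, horizon, max_folds=5, min_train=2):
    """Return (train_end, test_end) index pairs, most recent fold first."""
    folds = []
    test_end = n_points
    while len(folds) < max_folds and test_end - horizon >= min_train:
        train_end = test_end - horizon
        folds.append((train_end, test_end))
        test_end = train_end  # step the next fold back by one horizon
    return folds


def naive_forecast(train, horizon):
    """Stand-in forecaster (an assumption): repeat the last observed value."""
    return [train[-1]] * horizon


def cross_validated_mae(values, horizon, max_folds=5):
    """Average the MAE over every fold; also return how many folds were used."""
    fold_maes = []
    for train_end, test_end in backtest_folds(len(values), horizon, max_folds):
        train, actual = values[:train_end], values[train_end:test_end]
        forecast = naive_forecast(train, horizon)
        abs_errors = [abs(f - a) for f, a in zip(forecast, actual)]
        fold_maes.append(sum(abs_errors) / len(abs_errors))
    if not fold_maes:
        raise ValueError("not enough data for even one backtest")
    return sum(fold_maes) / len(fold_maes), len(fold_maes)


values = [10, 12, 11, 13, 15, 14, 16, 18, 17, 19]  # made-up data
folds = backtest_folds(len(values), horizon=2, max_folds=3)
avg_mae, n_folds = cross_validated_mae(values, horizon=2, max_folds=3)
print(folds, n_folds, round(avg_mae, 2))
```

With less data the loop stops early and returns fewer folds, which is why the number of folds is worth reporting alongside the averaged metrics.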