April 15, 2021

Follow up thoughts after #PPCchat

On Tuesday I was kindly invited to discuss PPC Forecasting as part of the regular #PPCchat Twitter chat. There is a recap put up on the official website.

#PPCchat started just over 10 years ago and, whilst I don’t think I was involved in the very first one, I joined the conversation as soon as it was moved to a more convenient time for the UK timezone. It was a very important part of my week for a number of years, but I’ve drifted away from the community as my work moved away from hands-on PPC management and more towards digital analytics and data science. I was very flattered to be invited to be a guest on the chat and very happy to be able to help a community that has helped me a lot.

The Twitter chat format felt a bit frantic, so I thought I’d expand here on some of the ideas and questions raised.

The Golden Rules

Before getting into anything too complicated, here are what I think are the two most important things you can do to get better at forecasting. They have nothing to do with machine learning or fancy techniques or anything like that; you should be able to apply them regardless of your current process.

Actually care about being right
Keep score; check how good your forecasts are against what actually happened and then try to figure out where you went wrong

If you do only these two things then your forecasting will start to improve. And you don’t even need to pay for a Forecast Forge subscription to do them!
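As a rough illustration of keeping score, here is a minimal Python sketch that compares a set of forecasts with what actually happened. The function name, the choice of metrics (mean absolute percentage error and average bias) and the numbers are all assumptions made for the example; use whatever error measure suits your own reporting.

```python
def score_forecast(forecast, actual):
    """Return (MAPE, bias) for paired forecast and actual values.

    Assumes every actual value is non-zero, because errors are
    expressed as a percentage of the actual.
    """
    if len(forecast) != len(actual) or not actual:
        raise ValueError("forecast and actual must be non-empty and the same length")
    # Percentage error per period; positive means the forecast was too high.
    pct_errors = [(f - a) / a for f, a in zip(forecast, actual)]
    mape = sum(abs(e) for e in pct_errors) / len(pct_errors)
    bias = sum(pct_errors) / len(pct_errors)
    return mape, bias


# Hypothetical monthly conversions: what was forecast vs what happened.
forecast = [110, 120, 90, 100]
actual = [100, 100, 100, 100]

mape, bias = score_forecast(forecast, actual)
print(f"MAPE: {mape:.1%}, bias: {bias:+.1%}")
```

MAPE tells you how far off you were on average; bias tells you whether you tend to forecast too high or too low, which is often the more useful clue when figuring out where you went wrong.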

Different types of forecast

To help clarify things and avoid talking at cross-purposes (always a risk on Twitter) I split forecasting up into three overlapping areas:

Forecasting something you’ve never done before (e.g. “we’ve never advertised on LinkedIn; how much will we get from that?”). This is hard and really relies on a lot of hands-on experience and marketing expertise for it to work well. Machine Learning of the type that Forecast Forge does isn’t very helpful here - an ML approach that might work is taking data from a lot of people who have done the thing and then trying to figure it out from there.
Forecasting doing more (or less) of something you’ve done a bit of in the past. The big one here is “what if we increased/decreased the budget?” but it might also be things like turning on retargeting, the site going into sale or above the line ad campaigns (e.g. TV). Forecast Forge can help quite a lot with this kind of thing.
Estimating trends and changes in everything else - e.g. what is the seasonality for CPC and conversion rate? Is overall search volume in this niche going up or down? Forecast Forge can help here too.

Type one forecasts, where you forecast the impact of something new, are a very interesting challenge; the way to approach them from a Machine Learning angle is to collect data from other people who have done the new thing and then try to figure out which of the other people your client is most similar to. This is basically what people do too when they draw on their experience to make an estimate. Forecast Forge doesn’t store your data, know anything about the sector your business is in, or know exactly what metrics you are forecasting so this is not something it supports (there are a few ways you could “hack” this - ask me if you want to test them).

Forecasting doing more or less of something that you’ve already been doing is really important for paid media; this is how you estimate the impact of increasing or decreasing your budgets which is a super-important and frequently demanded forecast for everyone.

A big part of the challenge here is that as budgets increase you will see diminishing returns; doubling the budget will not give you double the output. This problem has fascinated me for at least 10 years and I will have more to say on it in a future post. Machine learning can be quite helpful here but often it is difficult to have good quality data for training the algorithms. For example, it is much easier to say what might happen if you change the budget from $50k to $100k if there has been a time in the past when you’ve been spending close to $100k.
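To make the diminishing returns point concrete, here is a minimal sketch that fits a power curve, conversions = a * spend^b, to past spend and conversion data. The power curve is just one common assumption for this kind of relationship and is not necessarily what Forecast Forge does; the data is synthetic (generated from a known curve) and all the names are made up for the example.

```python
import math


def fit_power_curve(spend, conversions):
    """Fit conversions = a * spend**b by least squares on the logs.

    Taking logs turns the curve into a straight line:
    log(conversions) = log(a) + b * log(spend).
    """
    xs = [math.log(s) for s in spend]
    ys = [math.log(c) for c in conversions]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Ordinary least squares slope and intercept for a single predictor.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


# Synthetic history generated from a known curve with exponent 0.7.
spend = [25_000, 50_000, 75_000, 100_000]
conversions = [10 * s ** 0.7 for s in spend]

a, b = fit_power_curve(spend, conversions)


def predict(s):
    """Predicted conversions at spend level s under the fitted curve."""
    return a * s ** b


# How much more do we get from doubling the budget from $50k to $100k?
ratio = predict(100_000) / predict(50_000)
print(f"fitted exponent: {b:.2f}, uplift from doubling spend: {ratio:.2f}x")
```

An exponent below 1 means doubling the budget gives less than double the output. Note that the history in this example includes a period of spending $100k; if your own data stops well short of the budget you are asking about, the fitted curve is an extrapolation and deserves much less trust.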

The third type of forecast, where you make seasonal adjustments and extrapolate already existing trends, is where the machine learning approach really shines. Manually adjusting a forecast to take into account a floating holiday like Easter, or the fact that things are busy in Q4, just takes a lot longer than letting a machine do it. Of course, it isn’t always simple; the charts below show the changing seasonality around Black Friday/Cyber Monday for a US retailer:

Example from the Stumpy project documentation

The algorithm used by Forecast Forge won’t automatically learn this type of seasonality; it assumes there is much less change in the seasonal effect year to year. You can manually handle this kind of change with regressor columns but it is not a totally automated solution (I think requiring a bit of manual work to improve the forecast at important times of year is a reasonable compromise).
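As an example of the kind of manual work involved, here is a sketch of how you might generate the values for a Black Friday to Cyber Monday regressor column: 1 on the days in that window and 0 everywhere else. The function names are made up for this example and the exact layout Forecast Forge expects in the sheet is not covered here; this only shows how to work out which dates to flag each year.

```python
from datetime import date, timedelta


def black_friday(year):
    """The day after US Thanksgiving (the fourth Thursday of November)."""
    nov1 = date(year, 11, 1)
    # date.weekday(): Monday is 0, so Thursday is 3.
    first_thursday = nov1 + timedelta(days=(3 - nov1.weekday()) % 7)
    thanksgiving = first_thursday + timedelta(weeks=3)
    return thanksgiving + timedelta(days=1)


def cyber_weekend_regressor(dates):
    """Return 1 for Black Friday through Cyber Monday (inclusive), else 0."""
    flags = []
    for d in dates:
        start = black_friday(d.year)
        end = start + timedelta(days=3)  # Cyber Monday
        flags.append(1 if start <= d <= end else 0)
    return flags


# Eight consecutive days spanning the 2020 window.
days = [date(2020, 11, 25) + timedelta(days=i) for i in range(8)]
for d, flag in zip(days, cyber_weekend_regressor(days)):
    print(d.isoformat(), flag)
```

If the shape of the peak is changing from year to year, as in the charts, you could go further and use a separate column for each year or for each day in the window, at the cost of having less data behind each estimate.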

Forecast Forge compared to In-Platform Forecasts

Most of the major platforms and big tools have their own forecasting functionality. Where does Forecast Forge fit in with this?

I’ll talk about the Google Ads Keyword Planner forecasts here because that is the one I’m most familiar with. It has several advantages over what Forecast Forge can offer:

They can forecast using specific auction level data for you and for your competitors.
The forecasting algorithms are created and tweaked by some of the best data scientists in the world.
It is automated and built into the platform

All of these are very useful and I wish I could offer the same with Forecast Forge (especially as that would mean me being world class at this!) but I don’t have access to the data or the talent to make it possible.

But these forecasts are missing one very important element - you and all the expertise and knowledge you have about marketing and your client businesses. The idea that drives Forecast Forge is that you’ll make a better forecast by combining all your knowledge with a bit of machine learning than you will by using the fanciest algorithms.

For example, Google doesn’t know your marketing calendar: they don’t know when you will go into sale, when you are running above the line activity, or when you have new products launching. And who knows what they are doing to take into account countries moving in and out of Covid lockdowns? These can all be really important for making a better forecast and for scenario planning.

Another place where Forecast Forge has an edge - and this is true of soooo many products from big tech companies - is that you can talk to someone who knows how it works. You have approximately zero chance of ever speaking to one of the data scientists who works on the Ads Keyword Planner but I am always open to a chat about forecasting and how you can make things better. This might not be quite the same as saying the Forecast Forge model is “interpretable AI” or anything like that, but it is a lot closer to that end of the scale for you than any product from Google.

Comparison with `FORECAST.ETS` in Excel

Excel has a function called FORECAST.ETS that uses a forecasting algorithm called “Exponential Smoothing”. I’m going to talk a bit about the downsides of Exponential Smoothing here but I want to stress at this point that it is a totally legit forecasting algorithm and one of the ones that I always test out if I’m doing a non-Forecast Forge project.

The most important difference between what Forecast Forge can do and what FORECAST.ETS can do is that you can use regressors with Forecast Forge. As I’ve said above I think this is one of the most important things that can help you make a better forecast beyond what the platforms provide.

This isn’t just a weakness with FORECAST.ETS; it is a problem with all exponential smoothing forecasts as Rob Hyndman (a forecasting God) explains here.

Forecast Forge uses what is called a “generalised additive model” (GAM) to make the forecasts. This has a few other advantages over exponential smoothing, but these are more a matter of my opinion and some other data scientists will disagree:

The predictive interval tends to be better calibrated; I have often seen exponential smoothing forecasts predict an unreasonably wide predictive interval.
I much prefer how the trend is calculated in my GAM than in the Holt-Winters model that FORECAST.ETS uses. The GAM has the “idea” (anthropomorphising the algorithm here!) that the trend can change at certain set points.
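To show what a trend that “can change at certain set points” looks like, here is a minimal sketch of a piecewise linear trend. This is how changepoint trends are commonly written in GAM-style forecasting models; it is an illustration of the idea and not a description of Forecast Forge’s actual code. The function name, parameters and numbers are all hypothetical.

```python
def piecewise_linear_trend(t, base_level, base_slope, changepoints):
    """Trend value at time t.

    changepoints is a list of (time, delta) pairs: after each changepoint
    time the slope of the trend shifts by delta. The trend stays continuous
    because each adjustment is zero at its own changepoint.
    """
    value = base_level + base_slope * t
    for cp_time, delta in changepoints:
        if t > cp_time:
            value += delta * (t - cp_time)
    return value


# Hypothetical trend: starts at 100 and grows by 2 per week, then at
# week 10 the slope drops by 3 (so it declines by 1 per week afterwards).
changepoints = [(10, -3)]
for week in (0, 5, 10, 15, 20):
    print(week, piecewise_linear_trend(week, 100, 2, changepoints))
```

In a real model the size of each slope change is estimated from the data; only the places where a change is allowed are fixed in advance.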

Comparison with programming languages like R or Python

There is no competition here - R/Python and other programming languages have the potential to make far, far better forecasts than anything you can do with Forecast Forge. So if you are comfortable programming in any of these languages then I suggest you do that!

With Forecast Forge I am trying to make something for people who can think through a forecasting problem like a data scientist but who can’t program, or who aren’t comfortable or quick in R or Python. A spreadsheet interface can really speed things up: adding a sprinkling of machine learning on top of everything else people can already do in a spreadsheet makes possible things that would otherwise take too long. Unfortunately a spreadsheet is less flexible than programming in a text editor, so you lose a bit of control; this is the main reason why Forecast Forge isn’t as powerful a system as a programming language.

Even in organisations with a whole team of data scientists there are a lot more people who are comfortable in a spreadsheet than in R/Python. And it is often these people who have the expert knowledge of marketing and business - Forecast Forge can be a good fit here because it can take a bit of load off the data science team and put a bit of machine learning power in the hands of those who can combine it with their other skills.

How does it fit in with the trend towards automation?

One of the reasons I started to move away from PPC work is that I thought more and more things would be automated by the platforms and move out of my control. I found this gave me a feeling of alienation from my work and I hated that.

But automation can deliver better results - few PPCers can say they were better at managing product level ecommerce queries than Google Shopping is. So those who don’t get on board the (good) automation train will get left behind.

With Forecast Forge I am trying to offer a happy medium where people have 100% control over the inputs to the algorithm and lots of options for tweaking. The algorithm is still a bit of a black box but I’m hoping that there is enough control here to avoid the feeling of alienation whilst still keeping people on board the machine learning train.