The Problem of Projecting World Cup Performance

July 13, 2014; Rio de Janeiro, BRAZIL; Germany defender Jerome Boateng (20), midfielder Thomas Muller (13), and defender Per Mertesacker (17) celebrate after winning the championship match of the 2014 World Cup against Argentina at Maracana Stadium. Germany won 1-0. Mandatory Credit: Tim Groothuis/Witters Sport via USA TODAY Sports

I remember watching the Netherlands vs Costa Rica World Cup game on CBC and being astonished by the poor commentating. The commentators were obviously pro-Costa Rica, but aside from their bias, I was taken aback by their nonexistent understanding of statistics and projections. Obviously I cannot expect every commentator to have a good understanding of statistics, but their complete ignorance of the subject was almost comical. So today I would like to discuss some of the basics of projecting future performance and the problems of projecting World Cup performance, so that hopefully you can avoid some of the common pitfalls of predicting performance.

To begin, let’s look at the fundamental theoretical basis of projections, which is the idea that past performance can be used to predict the future. If someone’s performance has been poor in the past, you can reasonably expect that player to perform poorly in the future. For example, when I was younger, I played as a striker. I couldn’t score against a bunch of 10-year-olds, and this is why I did not pursue a career as a footballer. Back in high school, I was a baseball pitcher. I can’t really throw, so that’s why I’m not playing in MLB today. I could not succeed against weak competition, so there is absolutely no reason to believe that I could succeed professionally, and that’s why I didn’t get an athletics scholarship.

Intuitively, the idea that past performance can be used to predict the future is very easy to understand. Vincent Kompany was great last season, so unless something unexpected happens, it is safe to say that he will continue to be excellent next season.

On a micro level, it’s reasonable to state that unless something unexpected happens, athletic performance will not change dramatically overnight. Older players naturally see their skills decline with age, whereas younger players develop through training and practice and mature physically over time. You can read more about projecting future performance here (albeit in a baseball context).

The safe bet says that, on a micro level, things tend to stay relatively similar from season to season. Usually it is reasonable to assume that players will regress to the mean a bit; in other words, players better than average will perform slightly worse, and players worse than average will perform a little better. However, the most important fact to consider is that performance tends to stay stable. For performance to change drastically, there has to be an explanation. For example, a player would play significantly worse if he broke his leg in the offseason.
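To make regression to the mean concrete, here is a minimal sketch of one common way to express it: shrink the observed rate toward the league average by a stability factor. The function name, the league mean, and the reliability value are all hypothetical numbers chosen for illustration, not measured figures.

```python
def regress_to_mean(observed: float, league_mean: float, reliability: float) -> float:
    """Shrink an observed rate toward the league mean.

    reliability is between 0 and 1: 1 trusts the observation fully,
    0 ignores it and returns the league mean.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return league_mean + reliability * (observed - league_mean)


# Hypothetical numbers for illustration only (goals per game).
LEAGUE_MEAN = 0.30
RELIABILITY = 0.7  # assumed season-to-season stability, not a measured value

for observed in (0.60, 0.30, 0.10):
    projected = regress_to_mean(observed, LEAGUE_MEAN, RELIABILITY)
    print(f"observed {observed:.2f} -> projected {projected:.2f}")
```

Under these assumptions, the above-average player is projected a bit lower, the below-average player a bit higher, and the average player stays put. Nobody changes drastically, which is the whole point.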

On the macro level, projecting future performance becomes much harder. It is much harder to predict the performance of a team than that of a player. Teams experience lineup changes and management changes, and luck plays a larger role. There are significantly more factors to consider when projecting the performance of a team.

Now consider this: if it is already very difficult to project the performance of a team in a domestic league, it is pretty much impossible to project World Cup performance. Yet pundits do it all the time, with varying degrees of success. Let us consider World Cup predictions for a second.

On paper, if you randomly guessed the winner of the World Cup, you would have a 1/32 chance of getting it correct, which is a little over 3%. In truth, however, there are only a few good teams with a decent chance of winning. If we look at the polling data, the final 4 teams took 72% of the vote on who would win the World Cup in a Guardian poll. The top 4 took 58.47% of the vote in a similar NBC poll. In other words, it isn’t exactly difficult to look at the lineups and make some educated guesses, and that is pretty much how everyone, from the pundits on TV to the guy on your Sunday league team, makes their predictions. With some educated guessing, you have a decent chance of getting it right.

As for predicting the outcomes of individual games, if you were to guess randomly, you would have approximately a 1/3 chance of getting it right. In the knockout stages, you have a 50% chance of picking the correct team to advance. Picking the favorites improves your chances, but as FiveThirtyEight notes, it is actually an anomaly when the favorites win all the games in the knockout stages, as happened at this World Cup (before the quarter finals).
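The random-guess baselines above are simple arithmetic, and the same arithmetic hints at why a clean sweep by the favorites is unusual. The sketch below assumes every favorite has the same chance of winning and that games are independent; both are simplifications, and the per-game probabilities are made-up values rather than real odds.

```python
from fractions import Fraction

TEAMS = 32
p_champion = Fraction(1, TEAMS)   # random pick of the tournament winner
p_group_game = Fraction(1, 3)     # win / draw / loss, guessed at random
p_knockout = Fraction(1, 2)       # one of two teams advances

print(f"random champion pick: {float(p_champion):.3%}")
print(f"random group-game pick: {float(p_group_game):.1%}")
print(f"random knockout pick: {float(p_knockout):.0%}")


def p_all_favorites_win(p_single: float, games: int) -> float:
    """Chance that every favorite wins, assuming independent games
    and the same win probability for each favorite."""
    return p_single ** games


ROUND_OF_16_GAMES = 8
for p_single in (0.5, 0.65, 0.8):  # hypothetical per-game probabilities
    sweep = p_all_favorites_win(p_single, ROUND_OF_16_GAMES)
    print(f"favorite wins {p_single:.0%} of the time -> sweep chance {sweep:.1%}")
```

Even if each favorite were a heavy 80% bet, all eight advancing from the round of 16 would happen only about 17% of the time under these assumptions.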

So far, so good. Look at the lineups and pick the favorites to win. Everyone understands this, but when people try to invoke data from previous World Cups, they often start to make logical mistakes.
Often you will hear pundits say: “Team A are the favorites, but I would pick Team B, because B managed to defeat A X World Cups ago”. You will also hear variations like: “Team A is a much stronger team on paper, but they have never defeated Team B at the World Cup!” These statements are pretty much a staple of punditry, and fans seem to somewhat believe them. But let’s be honest here: predicting World Cup performance based on historical games is barely better than guessing, and you might as well just pick the favorite to win.

Consider this example: most pundits and projection systems will agree that the USMNT is significantly better than the Ghanaian Black Stars. According to the FIFA rankings, the USMNT is 13th in the world and the Black Stars are 37th. However, there is a certain group of pundits who picked Ghana simply because Ghana knocked the US out of the last 2 World Cups, calling Ghana the US’s Achilles’ heel.
But if we consider these statements logically, they really don’t stand to reason. World Cup squads have very high levels of turnover, and the majority of players on the USMNT and the Black Stars did not participate in the 2010 World Cup, much less the 2006 World Cup. There is absolutely no logical reason why Ghana should always beat the US, and even if Ghana had some sort of superpower that ensured their victory against the United States, the sample size is too small for anyone to come to the conclusion that Ghana will always beat the US at the World Cup.

Using results from previous World Cups to predict the results of matches at this World Cup is a completely futile effort. Please use this convenient chart to help you figure out the predictive value of an event at a previous World Cup:

The CBC commentators who made me angry in the Netherlands vs Costa Rica game were obviously biased, rooting for the Costa Ricans. Towards the end of the 90 minutes, they were convinced that the Costa Ricans would beat the Dutch. After all, the Dutch “never did well in extra time”. Let’s look at the validity of their claims, shall we?

The commentators were knowledgeable, I will give them that. They did their homework, and they knew about the historical performance of the Dutch at the World Cup. When the game went into extra time, the commentators quickly decided that Costa Rica would inevitably triumph. After all, the Dutch historically performed horribly in extra time at the World Cup!

They pulled out examples of Dutch failures from their heartbreaking past, but do they matter at all? Does the fact that the Dutch performed poorly in extra time in the past have any predictive value for whether they can win in extra time in 2014? Of course not!

When discussing Dutch failures in extra time, there are a few examples. For instance, did the Dutch not lose the 1978 final against Argentina in extra time? But besides being interesting trivia, does it have any predictive value for this World Cup? Not a single Dutch player from 1978 is still playing for the national team. To point at 1978 and say that the Dutch are bad in extra time is complete nonsense. I might as well say that the Dutch will win because DC United beat Toronto FC on July 5th; these two results have approximately the same predictive value for the performance of the current Dutch national team. You have a few data points, but their predictive value falls somewhere between “absolutely useless” and “mostly useless”.

When the game went to penalties, the commentators seemed to be even more confident that the Costa Ricans would advance. After all, the Netherlands have won exactly 0% of their penalty shootouts in the World Cup! The “statistics suggest” that the Dutch will fail. Look, STATISTICS! MATH! That’s gotta mean something right?

The 0% success rate is completely worthless at predicting how well the Dutch will do in penalty shootouts at this World Cup. After all, that number is derived from a sample size of a single game! ONE GAME! We can argue forever about how many data points you need before this number becomes meaningful (I say, for national teams, it’s impossible for penalty shootout success rates to become meaningful), but we can all agree that a single game is too small a sample. Not a single player on this year’s team took part in the 1998 World Cup in the first place, and don’t forget, penalty shootouts contain a high degree of luck as well.
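One way to see how little a 0-for-1 record says is to put a confidence interval around it. The sketch below uses the Wilson score interval, a standard textbook choice for small samples (my pick for this illustration, not something the commentators or any official source used). It treats shootouts as independent trials with a fixed win probability, which is itself a generous assumption for squads that change every four years.

```python
import math


def wilson_interval(wins: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a win rate (z = 1.96 gives roughly 95%)."""
    if n <= 0 or not 0 <= wins <= n:
        raise ValueError("need n > 0 and 0 <= wins <= n")
    p_hat = wins / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the edges.
    return max(0.0, center - half), min(1.0, center + half)


# The Dutch record cited by the commentators: 0 wins in 1 shootout.
low, high = wilson_interval(wins=0, n=1)
print(f"0 wins in 1 shootout: plausible win rate {low:.0%} to {high:.0%}")
```

With a single shootout, the interval runs from 0% to roughly 79%, which is another way of saying the record tells us almost nothing.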

With the high level of turnover and the microscopic sample sizes in the World Cup, here is a good rule of thumb whenever someone quotes results from previous World Cups: past performance has pretty much no predictive value for the future. And this applies to the Euros, Olympics, and other non-yearly tournaments.

Besides teams, pundits always love to use previous World Cup data to predict the performance of an individual player. For example: “Player X is bad in big games, he didn’t score at the last World Cup!” or the opposite, “Player Y might not be the best player, but he is a clutch player who steps it up in important games, just look at his World Cup performance!” Using data from World Cups to project future performance is slightly better for individuals than it is for teams, but its predictive value is still extremely low.

Remember this: 50% of the players in the World Cup go home after 3 games. 75% go home after 4. Is 3 to 4 games really a big enough sample size to use to evaluate a player? Lots of great players might do poorly in 3 games, and lots of poor players might do well in 3 games. This is why I do not place much value on World Cup performance when it comes to evaluating a player’s actual skill. After all, Ronaldo scored just one goal this World Cup, yet he is arguably the best footballer out there.
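To put a rough number on how easily a great player can look bad in 3 games, here is a small sketch that models goals as a Poisson process. The model and the 0.7 goals-per-game rate are assumptions for illustration only; they are not any real player’s figures.

```python
import math


def p_scoreless(goals_per_game: float, games: int) -> float:
    """Chance of zero goals across `games`, assuming goals arrive as a
    Poisson process with a constant per-game rate."""
    if goals_per_game < 0 or games < 0:
        raise ValueError("rate and games must be non-negative")
    return math.exp(-goals_per_game * games)


ELITE_RATE = 0.7  # hypothetical goals per game for a top striker

# 3 = group stage only, 4 = out in the round of 16,
# 7 = a full World Cup run, 38 = an assumed full league season
for games in (3, 4, 7, 38):
    print(f"{games:>2} games: scoreless with probability {p_scoreless(ELITE_RATE, games):.2%}")
```

Under these assumptions, even an elite scorer goes goalless through a three-game group stage about 12% of the time, while a scoreless full league season is practically impossible. The short tournament is what makes the bad run plausible, not the player.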

Besides sample size, World Cup performance cannot be used to effectively project performance outside of the World Cup because there are simply too many variables that may or may not affect player performance.

Is a player playing poorly because he is simply unskilled, or is it due to the fact that he barely knows his teammates on the national team? Or is it due to the fact that he does not like Brazil’s climate? There are too many factors that we have to compensate for before World Cup data can be used to project performance outside of the World Cup, and combined with a tiny sample size, World Cup performances really shouldn’t be used to project performance outside of it.

Is a player’s performance at previous World Cups predictive of how he will perform at future World Cups? For instance, pundits always say: “Well I know that this guy is great, but he performed horribly at the previous World Cup”, and use it to predict that a player will perform either well or horribly during the tournament. I disagree with this line of thinking, and I believe that a player’s performance at previous World Cups has little relationship with how he will perform in future tournaments.

After all, we again run into the sample size problem. With a sample size of 3 to 7 games, we really don’t have much data to look at. More importantly, how a player last performed at the World Cup is how he performed 4 years ago. 4 years is a very long time, and the nature of football is such that players can and often do change quite a bit in that time frame, especially if they are at either end of the age spectrum. Combined with other factors like teammate turnover and the difficulties of playing in an unfamiliar country, World Cup performance at previous events really cannot be used to predict how well a player will perform in the future.

There are two main branches of statistics, descriptive and inferential. Descriptive statistics summarizes what has already happened, for example: “this player scored 10 goals last season”. Inferential statistics is where you try to reach conclusions that extend beyond the current data, and it is the branch of statistics that we use to project athletic performance. One of the most important things to consider in inferential statistics is sample size, and getting a large sample size is simply impossible in the World Cup.
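Why sample size matters so much for inference can be seen in the standard error of an observed rate, which shrinks only with the square root of the number of games. The sketch below uses the usual binomial formula sqrt(p(1-p)/n); the 50% true rate and the 38-game league season are assumptions picked for illustration.

```python
import math


def standard_error(p: float, n: int) -> float:
    """Standard error of an observed proportion over n independent games."""
    if n <= 0 or not 0.0 <= p <= 1.0:
        raise ValueError("need n > 0 and 0 <= p <= 1")
    return math.sqrt(p * (1 - p) / n)


TRUE_RATE = 0.5  # hypothetical: a player who truly plays well in half his games

# 3 = group stage only, 7 = deepest possible World Cup run,
# 38 = an assumed full league season
for games in (3, 7, 38):
    se = standard_error(TRUE_RATE, games)
    print(f"{games:>2} games: observed rate is off by about +/- {se:.0%}")
```

Under these assumptions, a three-game sample is off by nearly 29 percentage points on a typical draw, and even a full seven-game run only brings that down to about 19.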

And thus, we have an interesting situation. The FIFA World Cup is soccer’s biggest trophy, yet it almost exists in a vacuum of sorts. We cannot use a team’s performance at previous World Cups to project how it will perform in future tournaments, nor can we use a player’s performance at the World Cup to predict how he will perform outside of it. So next time you hear some pundit claim that he can predict how a team will perform at the World Cup because of something that happened 20 years ago, just have a laugh and enjoy the game.

Update: CBC strikes again! The announcers kept talking about how South American teams won every time in South America, and how Argentina never conceded a goal in extra time. They really don’t understand the concepts of sample size or team turnover it seems.