Euro 2016 is here and fans across the competing nations are eagerly following their teams - hoping, dreaming that their country will bring the Henri Delaunay Trophy home. As a football fan, I watch my home team with eager anticipation. But as a data scientist, I also know a thing or two about using data to uncover probabilities of winning.
Indeed, data is freely available on the results of 36,000 international matches played since 1873, including their dates, scores and venues. It covers World Cup matches, European Championships, qualifiers and friendlies, and it makes it possible to analyse past results and predict the outcome of Euro 2016.
We fed the results from all these recorded international games into the bespoke machine learning algorithm that we developed. From those past results, the algorithm learned the probability of a win, a draw or a loss for every possible pairing of teams.
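To give a feel for what "learning probabilities from past results" can mean in its simplest form, here is a minimal sketch that turns a list of past outcomes into win, draw and loss probabilities by counting. It is not the bespoke algorithm described above, which is not published here. The function name `outcome_probabilities`, the additive smoothing and the toy history are all assumptions made for illustration.

```python
from collections import Counter


def outcome_probabilities(results, smoothing=1.0):
    """Estimate P(win), P(draw), P(loss) from one team's past outcomes.

    results:   iterable of "W", "D" or "L" from that team's perspective.
    smoothing: additive pseudo-count (an assumption of this sketch) so that
               an outcome never seen in the history keeps a non-zero chance.
    """
    counts = Counter(results)
    total = sum(counts[outcome] for outcome in "WDL") + 3 * smoothing
    return {outcome: (counts[outcome] + smoothing) / total for outcome in "WDL"}


# Made-up toy history for illustration only, not real match data.
toy_history = ["W", "W", "D", "L", "W", "D", "W"]
probs = outcome_probabilities(toy_history)
print(probs)
```

A real model would also have to weigh factors such as how long ago a match was played, its venue and the strength of the opposition, which simple counting ignores.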
Of course, the fixtures for the group stages of the tournament have already been drawn, but there are 94 billion possible ways the tournament could play out after this. We therefore used the Monte Carlo method to simulate 1 million complete tournaments from beginning to end, drawing the result of each game at random according to the probability of each team winning, drawing or losing against its opponent. (Monte Carlo simulation, or probability simulation, is a technique traditionally used in physics and the financial sector to understand the impact of risk and uncertainty in financial, project management, cost and other forecasting models.)
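The idea behind the Monte Carlo approach can be sketched in a few lines: play out a tournament many times with randomly drawn results, then count how often each team wins. The example below is a simplified illustration, not the simulation used for the forecast. It uses a four-team knockout bracket instead of the full Euro 2016 format, and the teams, the `STRENGTH` ratings and the win-probability formula are all hypothetical.

```python
import random
from collections import Counter

# Hypothetical strength ratings for illustration only; not values from the model.
STRENGTH = {"A": 4.0, "B": 2.0, "C": 1.0, "D": 1.0}


def win_probability(team, opponent):
    """Assumed probability that `team` beats `opponent` (ratio of strengths)."""
    return STRENGTH[team] / (STRENGTH[team] + STRENGTH[opponent])


def play(team, opponent, rng):
    """Draw the winner of a single knockout game at random."""
    return team if rng.random() < win_probability(team, opponent) else opponent


def simulate_knockout(teams, rng):
    """Play one single-elimination bracket and return the champion.

    The number of teams must be a power of two; neighbours in the list meet.
    """
    remaining = list(teams)
    while len(remaining) > 1:
        remaining = [
            play(remaining[i], remaining[i + 1], rng)
            for i in range(0, len(remaining), 2)
        ]
    return remaining[0]


def title_chances(teams, n_runs=50_000, seed=0):
    """Simulate `n_runs` tournaments and return each team's share of titles."""
    rng = random.Random(seed)
    wins = Counter(simulate_knockout(teams, rng) for _ in range(n_runs))
    return {team: wins[team] / n_runs for team in teams}


chances = title_chances(["A", "B", "C", "D"])
for team, chance in chances.items():
    print(f"{team}: {chance:.1%}")
```

The more tournaments are simulated, the closer the counted shares get to the true probabilities implied by the game-level model, which is why a large number of runs such as 1 million is useful.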
It is never possible to predict with 100 per cent certainty who will win this year’s tournament. Previous shocks, like Greece’s victory in Euro 2004, have shown that results can sometimes be totally unpredictable. Indeed, as we’ve seen already in this tournament, unexpected events are possible, such as Russia’s goal in the final moments of the England v Russia game. However, it is possible to make statements of probability.
The advanced machine learning algorithm predicted the overall winner to be France, who will enjoy a home advantage throughout the tournament and have a 34.1 per cent chance of winning the competition, 16 years after their last European triumph in 2000. This is more than double the chance of their closest rival for the championship title, Spain, who only have a 13.4 per cent probability of winning the competition in comparison.
The results show that the English team is likely to get off to a strong start, with a 56.7 per cent probability of finishing top of its group and a 93.3 per cent probability of qualifying for the round of 16. The probability of England reaching the quarter-finals is almost two thirds (62.7 per cent). However, the chance of getting to the semi-finals slips to 36.7 per cent, and a place in the final is unlikely at 21.2 per cent. This leaves England with only an 11 per cent chance of winning the competition for the first time in history.
Other predictions made by Blue Yonder’s football forecasting algorithm include:
England finish top of their group: 56.7%
England finish second in their group: 26.8%
England finish third in their group: 12.2%
England finish last in their group: 4.3%
England reach the last 16: 93.3%
England reach the quarter-finals: 62.7%
England reach the semi-finals: 36.7%
England reach the final: 21.2%
England win Euro 2016: 11.0%
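The stage-by-stage figures above also imply how likely England are to clear each hurdle once they have cleared the previous one. Because a team can only reach a later stage by reaching the earlier one, the conditional probability is the ratio of the two published figures. The short sketch below does that arithmetic; the variable names are illustrative only.

```python
# England's published probabilities of reaching each stage (from the list above).
stage_probability = {
    "last 16": 0.933,
    "quarter-finals": 0.627,
    "semi-finals": 0.367,
    "final": 0.212,
    "champions": 0.110,
}

stages = list(stage_probability)

# P(next stage | previous stage reached) = P(next stage) / P(previous stage)
conditional = {
    f"{prev} -> {nxt}": stage_probability[nxt] / stage_probability[prev]
    for prev, nxt in zip(stages, stages[1:])
}

for step, probability in conditional.items():
    print(f"{step}: {probability:.1%}")
```

For example, the figures imply that if England do reach the final, their chance of winning it is roughly 52 per cent.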
If England get to the quarter-finals, then these countries are most likely to be their opponents:
If England reach the semi-finals, these teams are most likely to be their opponents:
If England reach the final, then they are most likely to face:
The top 12 teams with the best chances to win the European Championship:
1. France: 34.1%
2. Spain: 13.4%
3. England: 11.0%
4. Germany: 9.8%
5. Belgium: 4.4%
6. Portugal: 3.8%
7. Poland: 3.2%
8. Russia: 3.0%
9. Croatia: 2.7%
10. Italy: 2.5%
11. Switzerland: 1.6%
12. Romania: 1.5%
The 12 teams with the worst chances to win the European Championship:
13. Republic of Ireland: 1.4%
14. Turkey: 1.4%
15. Ukraine: 1.4%
16. Austria: 1.2%
17. Sweden: 1.2%
18. Czech Republic: 0.88%
19. Hungary: 0.66%
20. Slovakia: 0.5%
21. Albania: 0.14%
22. Wales: 0.14%
23. Iceland: 0.05%
24. Northern Ireland: 0.04%