If you’ve ever played in an amateur football team, you might have heard your manager talk about the 10 minutes after a goal being critical, as this is the most likely time for another goal to be scored. I decided to investigate whether there was any basis in professional football results for the claim that goals come in pairs. Does the scoring of a goal make another goal in the next 5 or 10 minutes more likely?
Studying Euro 2016
To answer this question, I wrote a script using Python to scrape Euro 2016 goal times from a popular news website’s match reports so that I could analyse the data. Once I knew the minute of every goal scored in the tournament, I could compare, for each goal, the number of goals actually scored in the next 5 minutes with the number we would have expected for that time period.
Goal frequency is not evenly distributed throughout the 90 minutes of play: later minutes of the match are typically higher scoring. To avoid skewing the results with the large number of late goals, I calculated the expected number of goals for a given period as the average number of goals scored in that same period of the match across all games in the tournament. To illustrate this point, the following graph shows the significant difference in the number of expected goals between 5-minute periods in the first half and 5-minute periods in the second half across 6 European leagues. Each bar in the graph reflects the average number of goals in the 5 minutes after the time indicated on the x-axis:
Figure 1: Average number of goals by 5 minute period across 6 European leagues in the 2015-2016 season
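As a rough illustration of the averaging behind a chart like this, here is a minimal Python sketch. It is not the original script: the function name, the input format (one list of goal minutes per match) and the decision to count stoppage-time goals in the final period are all assumptions made for the example.

```python
from collections import Counter


def expected_goals_by_period(matches, period=5, match_length=90):
    """Average goals per match in each `period`-minute slice of the game.

    `matches` is a hypothetical input: one list of goal minutes per match,
    with minutes numbered from 1.
    """
    counts = Counter()
    for goal_minutes in matches:
        for minute in goal_minutes:
            # Assumption: stoppage-time goals are counted in the last period.
            m = min(max(minute, 1), match_length)
            period_start = ((m - 1) // period) * period
            counts[period_start] += 1

    n_matches = len(matches)
    # Keys are period start times: 0, 5, 10, ... as on the x-axis.
    return {start: counts[start] / n_matches
            for start in range(0, match_length, period)}


# Three made-up matches, the last one goalless.
averages = expected_goals_by_period([[3, 47, 88], [12, 90], []])
print(averages[85])  # average goals per match in minutes 86 to 90
```

Dividing by the number of matches, including goalless ones, is what turns the raw counts into an expected number of goals per game for each period.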
Having calculated this expected number of goals, I could then take the average over all goals to see if the number of goals actually scored in the minutes after a goal was higher than expected, producing the results below:
Figure 2: Actual vs. expected number of goals in the 10 minutes after a goal had been scored at Euro 2016
Averaged across the 51 games in the competition, fewer goals than expected were scored both in the 5 minutes after a goal and in the 5 minutes after that, which seems to refute the hypothesis that goals come in pairs.
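The comparison itself can be sketched as follows. This is a simplified reconstruction under stated assumptions rather than the code actually used: the function name and input format are hypothetical, stoppage-time goals are clamped to minute 90, and the expected figure is built from a per-minute average across all matches.

```python
def actual_vs_expected(matches, start=1, end=5, match_length=90):
    """Average actual and expected goals `start`..`end` minutes after a goal.

    `matches` is a hypothetical input: one list of goal minutes per match.
    """
    n_matches = len(matches)
    # Assumption: clamp every goal into minutes 1..match_length.
    clean = [[min(max(m, 1), match_length) for m in goals] for goals in matches]

    # Average goals per match in each individual minute (index = minute).
    rate = [0.0] * (match_length + 1)
    for goals in clean:
        for m in goals:
            rate[m] += 1 / n_matches

    actual = expected = 0.0
    n_goals = 0
    for goals in clean:
        for t in goals:
            lo, hi = t + start, min(t + end, match_length)
            # Goals really scored in this match during the window.
            actual += sum(1 for m in goals if lo <= m <= hi)
            # Goals an average match sees during the same minutes.
            expected += sum(rate[lo:hi + 1])
            n_goals += 1

    if n_goals == 0:
        return 0.0, 0.0
    return actual / n_goals, expected / n_goals


matches = [[10, 13], [12], []]  # made-up goal minutes
print(actual_vs_expected(matches, start=1, end=5))
print(actual_vs_expected(matches, start=6, end=10))
```

Calling it with start=1, end=5 and then start=6, end=10 gives the two 5-minute windows discussed above. Note that the expected figure here includes the match being examined, which is a small simplification.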
Expanding the sample size
Of course, 51 games is a small sample size. What if we look at a bigger data set, say all matches from the 6 top football leagues in Europe in the 2015-2016 season, a total of 2132 games?
Generalising the web scraping script and repeating the analysis, I produced the following table:
Figure 3: Actual vs. expected number of goals in the 10 minutes after a goal had been scored in European leagues in the 2015-2016 season
Here a different picture emerges. While additional goals are significantly less likely in the first 5 minutes after a goal has been scored, the 5 minutes after that are prime time for more goals.
This is not too surprising given the minute or two that it typically takes for play to recommence after a goal has been scored. Indeed, if we look at the data in minute-by-minute detail, we see that lull clearly represented in the first minute after the goal:
Figure 4: Percentage difference between actual vs. expected number of goals by number of minutes after a goal had been scored across 6 European leagues in the 2015-2016 season
This graph tells the story at a more granular level: goals are much less likely in the first two minutes after a goal, gradually becoming more likely until around the 8th minute, before settling back down towards a random pattern centred around 0.
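A minute-by-minute version of the comparison could look something like the sketch below. Again, this is an illustration under assumptions, not the original analysis code: the function name and input format are hypothetical, and stoppage-time goals are clamped to minute 90.

```python
def pct_difference_by_lag(matches, max_lag=15, match_length=90):
    """Percentage difference between actual and expected goals for each
    number of minutes (`lag`) after a goal.

    `matches` is a hypothetical input: one list of goal minutes per match.
    """
    n_matches = len(matches)
    # Assumption: clamp every goal into minutes 1..match_length.
    clean = [[min(max(m, 1), match_length) for m in goals] for goals in matches]

    # Average goals per match in each individual minute (index = minute).
    rate = [0.0] * (match_length + 1)
    for goals in clean:
        for m in goals:
            rate[m] += 1 / n_matches

    result = {}
    for lag in range(1, max_lag + 1):
        actual = expected = 0.0
        for goals in clean:
            for t in goals:
                if t + lag <= match_length:
                    actual += goals.count(t + lag)
                    expected += rate[t + lag]
        # None marks lags with no expected goals, where a percentage is undefined.
        result[lag] = 100 * (actual - expected) / expected if expected else None
    return result


by_lag = pct_difference_by_lag([[10, 13], [12], []], max_lag=5)
print(by_lag)
```

A value of 0 means goals at that lag were exactly as frequent as in an average match; negative values correspond to the lull straight after a goal.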
A reliable conclusion?
If you’re reading this analysis with a critical eye, you might question whether some games were always prone to lots of goals. If this were the case, the majority of the league’s goals would have occurred in these inherently goal-prone games. After each goal, another goal in the next 10 minutes would have been more likely than in an average game not necessarily because of the goal that was just scored, but because the game was goal-prone in the first place.
For example, if two very attacking teams play each other, we might expect more goals in a given 10 minute period than in an average league game. If a goal is scored, we might still expect more goals in the next 10 minutes than in an average game, but do we expect more goals than if the first goal had not been scored? The analysis above does not answer this question.
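To see how large this effect could be, consider a toy calculation. Everything in it is assumed for illustration: the league consists of just two kinds of game with made-up scoring rates, and goals within a game arrive at a constant rate, independently of one another, so a goal never makes the next one more or less likely.

```python
def goal_weighted_rate(rates, shares):
    """Compare the league-average scoring rate with the rate seen after a goal.

    `rates` are hypothetical goals-per-minute rates for each type of game and
    `shares` the fraction of games of each type. Because high-scoring games
    contribute more goals, a randomly chosen goal is more likely to come from
    one of them, so the rate after a goal is E[rate^2] / E[rate].
    """
    mean_rate = sum(r * s for r, s in zip(rates, shares))
    mean_square = sum(r * r * s for r, s in zip(rates, shares))
    return mean_rate, mean_square / mean_rate


# Made-up league: half the games are cagey, half are open.
league_avg, after_goal = goal_weighted_rate(rates=[0.02, 0.04], shares=[0.5, 0.5])
print(f"league average: {league_avg:.4f} goals per minute")
print(f"after a goal:   {after_goal:.4f} goals per minute")
print(f"ratio:          {after_goal / league_avg:.3f}")
```

In this made-up league, the minutes after a goal see about 11% more goals than the league average even though no goal has any influence on the next, purely because goals are more often found in open games.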
Next week we will address this challenge by calculating the expected number of goals for each team based on home advantage and the relative attacking and defensive strength of the two teams, and see whether the hypothesis that goals come in pairs still holds. In the meantime, though, we can be confident that a second goal is more likely to occur 5 to 10 minutes after the first goal than immediately after it.
So if you’re hoping to take advantage of the goal celebrations to go and grab a bite from the kitchen, don’t worry too much about missing a goal if you’ll be back within a minute or two. But whatever you do, make sure you’re gone for no more than 5 minutes.
This article was written by Alex Bleakley from CapGemini: Business Analytics (UK) and was legally licensed through the NewsCred publisher network.