To make sense of the statistics available for football, we need to understand their context. I am therefore going to start simple by looking at baselines for various events and statistics while I build up the information required to start a mathematical model.
While most football analytics seems to focus heavily on goals, I am going to start off with defending and the all-important clean sheet. Clean sheets have been fairly consistent throughout the English Premier League’s (EPL) history, occurring in around 27% of matches between 1993 and 2011 (Figure 1). The data shows some variability around the mean with perhaps the slightest hint of an upwards trend, but in general the total number of clean sheets per season has remained broadly constant.
Figure 1: Total English Premier League Clean Sheets
If we split the data by home and away then we can immediately see a significant difference (Figure 2; p<0.001). On average, the home team will keep a clean sheet 33% of the time while the away team will only manage it in 22% of their matches. Interestingly, both sets of data appear to follow broadly similar patterns, with peaks and troughs occurring in the same years. I hope to explore this in more detail in the future.
Figure 2: Clean Sheets Home and Away
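For anyone wanting to check a home versus away difference like this themselves, here is a minimal sketch of a two-proportion z-test. The post does not say which test produced the p-value, so the choice of test is an assumption, and the match counts below are hypothetical round numbers picked only to match the 33% and 22% rates. It also treats the two samples as independent, which is a simplification because home and away clean sheets come from the same fixtures.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for a difference between two proportions."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled proportion under the null hypothesis that both rates are equal
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts (not the real EPL totals): 1000 matches,
# 330 home clean sheets and 220 away clean sheets.
z, p_value = two_proportion_z_test(330, 1000, 220, 1000)
print(f"z = {z:.2f}, p < 0.001: {p_value < 0.001}")
```

With the real season-by-season counts you would simply swap in the observed totals for the hypothetical ones.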
Clean sheets are valuable commodities as they guarantee you a minimum of one point. As the cliché goes, if you keep a clean sheet you cannot lose. Looking back over the EPL’s history shows that a clean sheet at home is actually worth 2.1 points on average. This means that over the 19 home matches of a season, keeping a clean sheet in 33% of them would be expected to generate 13.2 points. Away from home a clean sheet is of lower value, generating just 1.8 points each. Keeping one in 22% of the 19 away matches would therefore bring in an additional 7.5 points.
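The arithmetic behind those two season totals can be sketched as below. It assumes the current 38-match season (19 home and 19 away fixtures), which is the figure that reproduces the numbers quoted above; the function and constant names are just for illustration.

```python
MATCHES_PER_VENUE = 19  # assumes a 38-match season: 19 home, 19 away


def expected_clean_sheet_points(clean_sheet_rate, points_per_clean_sheet,
                                matches=MATCHES_PER_VENUE):
    """Expected season points earned from matches ending in a clean sheet."""
    expected_clean_sheets = matches * clean_sheet_rate
    return expected_clean_sheets * points_per_clean_sheet


home_points = expected_clean_sheet_points(0.33, 2.1)  # 19 * 0.33 * 2.1
away_points = expected_clean_sheet_points(0.22, 1.8)  # 19 * 0.22 * 1.8

print(round(home_points, 1))  # 13.2
print(round(away_points, 1))  # 7.5
```

Together that comes to roughly 20.7 points a season for a team keeping clean sheets at exactly the league-average rate.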
We can use these baselines to examine how teams are performing in terms of clean sheets home and away. Figure 3 shows the proportion of matches in which each team in the EPL obtained clean sheets for the 2011-2012 season. The teams in the upper right quadrant all achieved an above average number of clean sheets both home and away. In comparison, West Brom’s defence performed very well at home yet they struggled to obtain clean sheets away from the Hawthorns. Liverpool were the opposite of West Brom, keeping clean sheets away from Anfield but struggling at home. Bolton, Blackburn and Wolves all generated very low numbers of clean sheets home and away and were all relegated from the EPL. Norwich are an interesting exception as they possessed the worst away record for clean sheets yet managed to finish in a respectable 12th position last year.
Figure 3: Proportion of Matches With Clean Sheets Home and Away
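The quadrant reading of the figure can be sketched as a simple comparison against the two baselines. This assumes the quadrants are split at the historical averages of 33% at home and 22% away; the team names and rates below are hypothetical placeholders, not the actual 2011-2012 figures.

```python
HOME_BASELINE = 0.33  # historical home clean sheet rate
AWAY_BASELINE = 0.22  # historical away clean sheet rate


def classify_team(home_rate, away_rate):
    """Place a team in a quadrant relative to the home and away baselines."""
    strong_home = home_rate > HOME_BASELINE
    strong_away = away_rate > AWAY_BASELINE
    if strong_home and strong_away:
        return "above average home and away"
    if strong_home:
        return "above average at home only"
    if strong_away:
        return "above average away only"
    return "below average home and away"


# Hypothetical teams and rates, for illustration only
example_rates = {
    "Team A": (0.47, 0.37),
    "Team B": (0.42, 0.11),
    "Team C": (0.21, 0.32),
    "Team D": (0.16, 0.05),
}

for team, (home_rate, away_rate) in example_rates.items():
    print(team, "->", classify_team(home_rate, away_rate))
```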
If we carry out linear regression on 2011-2012’s data (Figure 4) we can see the relationship between the number of clean sheets a team kept over the season and their final league position. The r² value of 0.72 for the regression shows that the two are strongly correlated, so any team not keeping clean sheets could be expected to finish lower down the league table. This does not bode well for current champions Manchester City, who have conceded goals in seven of their eight EPL matches this season.
Figure 4: Correlation of Final League Position to Number of Clean Sheets 2011/2012
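For reference, here is a minimal sketch of an ordinary least squares fit and its r² value in plain Python. The clean sheet counts and league positions below are hypothetical and are not the 2011-2012 data, so the output will not reproduce the 0.72 quoted above.

```python
def linear_regression(xs, ys):
    """Ordinary least squares fit of y on x; returns slope, intercept, r squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a single predictor, r squared is the squared Pearson correlation
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical season: clean sheets kept and final league position
clean_sheets = [17, 15, 12, 10, 8, 5]
positions = [1, 3, 6, 10, 14, 19]

slope, intercept, r_squared = linear_regression(clean_sheets, positions)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, r squared = {r_squared:.2f}")
```

A negative slope is what you would expect here, since more clean sheets should go with a numerically lower (better) league position.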