On quantifying passing skill

Quantifying passing skill has gained greater attention over the past 18 months in public analytics circles, with Paul Riley, StatsBomb and Played off the Park regularly publishing insights from their passing models. I talked a little about my own model last season but only published results on how teams disrupted their opponents’ passing. I thought delving into the nuts and bolts of the model plus reporting some player-centric results would be a good place to start, as I plan to write more on passing over the next few months.

Firstly, the model quantifies the difficulty of an open-play pass based on its start and end location, as well as whether it was with the foot or head. So for example, relatively short backward passes by a centre back to their goalkeeper are completed close to 100% of the time, whereas medium-range forward passes from out-wide into the centre of the penalty area have pass completion rates of around 20%.

The data used for the model is split into training and testing sets to prevent over-fitting. The Random Forest-based model does a pretty good job of representing the different components that drive pass difficulty, some of which are illustrated in the figure below (also see the appendix here for some further diagnostics).


Comparison between expected pass completion rates from two different passing models and actual pass completion rates based on the start and end location of an open-play pass. The horizontal dimension is orientated from left-to-right, with zero designating the centre of the pitch. The dashed lines in the vertical dimension plots show the location of the edge of each penalty area. Data via Opta.

One slight wrinkle with the model is that it has trouble with very short passes of less than approximately 5 yards due to the way the data is collected; if a player attempts a pass and an opponent in their immediate vicinity blocks it, then the pass is recorded as unsuccessful, which makes it look like such passes are really hard, even though the player was actually attempting a much longer pass. Neil Charles reported something similar in his OptaPro Forum presentation in 2017. For the rest of the analysis, such passes are excluded.
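As a rough illustration of that exclusion step, here is a minimal sketch. It assumes pass locations are stored as (x, y) coordinates in yards and that "length" means straight-line distance between the recorded start and end points; the field names and the example passes are hypothetical, not taken from the actual data.

```python
import math

MIN_PASS_LENGTH_YARDS = 5.0  # approximate cut-off quoted in the text


def pass_length(start, end):
    """Straight-line distance between two (x, y) points, assumed to be in yards."""
    return math.hypot(end[0] - start[0], end[1] - start[1])


def drop_short_passes(passes, min_length=MIN_PASS_LENGTH_YARDS):
    """Keep only passes whose recorded length is at least min_length."""
    return [p for p in passes if pass_length(p["start"], p["end"]) >= min_length]


# Hypothetical passes, purely for illustration.
passes = [
    # Recorded as very short and failed: plausibly a longer pass that was blocked.
    {"start": (50.0, 30.0), "end": (52.0, 31.0), "completed": False},
    # A 25-yard pass that is kept.
    {"start": (50.0, 30.0), "end": (70.0, 45.0), "completed": True},
]
kept = drop_short_passes(passes)
```

The filter works on recorded length only, so it cannot tell a genuine short pass from a blocked long one; it simply drops both.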

None shall pass

That gets some of the under-the-hood stuff out of the way, so let’s take a look at ways of quantifying passing ‘skill’.

Similar to the concept of expected goals, the passing model provides a numerical likelihood of a given pass being completed by an average player; deviations from this expectation in reality may point to players with greater or less ‘skill’ at passing. The analogous concept from expected goals would be comparing the number of goals scored versus expectation and interpreting this as ‘finishing skill’ or lack thereof. However, when it comes to goal-scoring, such interpretations tend to be very uncertain due to significant sample size issues, because shots and goals are relatively infrequent occurrences. This is less of a concern when it comes to passing though, as many players will often attempt more passes in a season than they would take shots in their entire career.

Another basic output of such models is an indication of how adventurous a player is in their passing – are they playing lots of simple sideways passes or are they regularly attempting defense-splitting passes?
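To make the two outputs concrete, here is a minimal sketch. It assumes the skill rating is the ratio of completed passes to the model's expected completions, and that pass difficulty is the mean expected completion expressed as a percentage; both definitions are my reading of the numbers reported below rather than a confirmed formula, and the example passes are hypothetical.

```python
def passing_summary(passes):
    """Summarise a player's passing.

    passes: list of (expected_completion_probability, completed) tuples.
    Returns the assumed rating (completed / expected) and the average
    expected completion as a percentage.
    """
    if not passes:
        raise ValueError("need at least one pass")
    expected = sum(xp for xp, _ in passes)
    completed = sum(1 for _, done in passes if done)
    return {
        "rating": completed / expected,
        "avg_difficulty_pct": 100.0 * expected / len(passes),
    }


# Hypothetical player with four passes of varying difficulty.
example = [(0.95, True), (0.80, True), (0.60, True), (0.25, False)]
summary = passing_summary(example)
```

A rating above 1 means more passes were completed than an average player would be expected to complete, while a lower average expected completion means a more adventurous mix of passes.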

The figure below gives a broad overview of these concepts for out-field players from the top-five leagues (England, France, Germany, Italy and Spain) over the past two seasons. Only passes with the feet are included in the analysis.


Passing ‘skill’ compared to pass difficulty for outfield players from the past two seasons in the big-five leagues, with each data point representing a player who played more than 3420 minutes (equivalent to 38 matches) over the period. The dashed lines indicate the average values across each position. Foot-passes only. Data from Opta.

One of the things that is clear when examining the data is that pulling things apart by position is important as the model misses some contextual factors and player roles obviously vary a huge amount depending on their position. The points in the figure are coloured according to basic position profiles (I could be more nuanced here but I’ll keep it simpler for now), with the dashed lines showing the averages for each position.

In terms of pass difficulty, midfielders attempt the easiest passes with an average expected completion of 83.2%. Forwards (81.6%) attempt slightly easier passes than defenders (81.4%). That both groups attempt harder passes than midfielders makes sense to me, as forwards are often going for tough passes in the final third, while defenders are playing more long passes and crosses.

Looking at passing skill is interesting, as it suggests that the average defender is actually more skilled than the average midfielder?!? While the modern game requires defenders to be adept in possession, I’m unconvinced that their passing skills outstrip midfielders. What I suspect is happening is that passes by defenders are being rated as slightly harder than they are in reality due to the model not knowing about defensive pressure, which on average will be less for defenders than midfielders.

Forwards are rated worst in terms of passing skill, which is probably again a function of the lack of defensive pressure included as a variable, as well as other skills being more-valued for forwards than passing e.g. goal-scoring, dribbling, aerial-ability.

Pass muster

Now we’ve got all that out of the way, here are some lists separated by position. I don’t watch anywhere near as much football as I once did, so really can’t comment on quite a few of these and am open to feedback.

Note the differences between the players on these top-ten lists are tiny, so the order is pretty arbitrary and there are lots of other players that the model thinks are great passers who just missed the cut.

First-up, defenders: *shrugs*.

In terms of how I would frame this, I wouldn’t say ‘Faouzi Ghoulam is the best passer out of defenders in the big-five leagues’. Instead I would go for something along the lines of ‘Faouzi Ghoulam’s passing stands out and he is among the best left-backs according to the model’. The latter is more consistent with how football is talked about in a ‘normal’ environment, while also being a more faithful presentation of the model.

Looking at the whole list, there is quite a range of pass difficulty, with full-backs tending to play more difficult passes (passes into the final third, crosses into the penalty area) and the model clearly rates good crossers like Ghoulam, Baines and Valencia. That is a very different skill-set to what you would look for in a centre back, so filtering the data more finely is an obvious next step.

Defenders (* denotes harder than average passes)

Name               Team               xP rating  Pass difficulty (%)
Faouzi Ghoulam     Napoli             1.06       80.3*
Leighton Baines    Everton            1.06       76.5*
Stefan Radu        Lazio              1.06       82.1
Thiago Silva       PSG                1.06       91.0
Benjamin Hübner    Hoffenheim         1.05       84.4
Mats Hummels       Bayern Munich      1.05       86.0
Kevin Vogt         Hoffenheim         1.05       87.4
César Azpilicueta  Chelsea            1.05       83.4
Kalidou Koulibaly  Napoli             1.05       87.8
Antonio Valencia   Manchester United  1.05       80.0*

On to midfielders: I think this looks pretty reasonable with some well-known gifted passers making up the list, although I’m a little dubious about Dembélé and Fernandinho being quite this high up. Iwobi is an interesting one and will keep James Yorke happy.

Fàbregas stands out due to his average expected completion being well below the midfield average (i.e. he attempts harder passes) without having a cross-heavy profile – nobody gets near him for the volume of difficult passes he completes.

Midfielders (* denotes harder than average passes)

Name               Team               xP rating  Pass difficulty (%)
Cesc Fàbregas      Chelsea            1.06       79.8*
Toni Kroos         Real Madrid        1.06       88.1
Luka Modric        Real Madrid        1.06       85.9
Arjen Robben       Bayern Munich      1.05       79.6*
Jorginho           Napoli             1.05       86.8
Mousa Dembélé      Tottenham Hotspur  1.05       89.9
Fernandinho        Manchester City    1.05       87.2
Marco Verratti     PSG                1.05       87.3
Alex Iwobi         Arsenal            1.05       84.9
Juan Mata          Manchester United  1.05       84.5

Finally, forwards AKA ‘phew, it thinks Messi is amazing’.

Özil is the highest-rated player across the dataset, which is driven by his ability to retain possession and create in the final third. Like Fàbregas above, Messi stands out for the difficulty of the passes he attempts and that he is operating in the congested central and half-spaces in the final third, where mere mortals (and the model) tend to struggle.

In terms of surprising names: Alejandro Gomez appears to be very good at crossing, while City’s meep-meep wide forwards being so far up the list makes me wonder about team-effects.

Also, I miss Philippe Coutinho.

Forwards (* denotes harder than average passes)

Name               Team               xP rating  Pass difficulty (%)
Mesut Özil         Arsenal            1.07       82.9
Eden Hazard        Chelsea            1.05       81.9
Lionel Messi       Barcelona          1.05       79.4*
Philippe Coutinho  Liverpool          1.04       80.6*
Paulo Dybala       Juventus           1.03       84.8
Alejandro Gomez    Atalanta           1.03       74.4*
Raheem Sterling    Manchester City    1.03       81.6*
Leroy Sané         Manchester City    1.03       81.9
Lorenzo Insigne    Napoli             1.03       84.3
Diego Perotti      Roma               1.02       78.4*

Finally, the answer to what everyone really wants to know: who is the worst passer? Step forward Mario Gómez – I guess he made the right call when he pitched his tent in the heart of the penalty area.

Pass it on

While this kind of analysis can’t replace detailed video and live scouting for an individual, I think it can provide a lot of value. Traditional methods can’t watch every pass by every player across a league but data like this can. However, there is certainly a lot of room for improvement and further analysis.

A few things I particularly want to work on are:

  • Currently there is no information in the model about the type of attacking move that is taking place, which could clearly influence pass difficulty e.g. a pass during a counter-attacking situation or one within a long passing-chain with much slower build-up. Even if you didn’t include such parameters in the model, it would be a nice means of filtering different pass situations.
  • Another element in terms of context is attempting a pass after a dribble, especially given some of the ratings above e.g. Hazard and Dembélé. I can envisage the model somewhat conflates the ability to create space through dribbling and passing skill (although this isn’t necessarily a bad thing depending on what you want to assess).
  • Average difficulty is a bit of a blunt metric and hides a lot of information. Developing this area should be a priority for more detailed analysis as I think building a profile of a player’s passing tendencies would be a powerful tool.
  • You’ll have probably noticed the absence of goalkeepers in the results above. I’ve left them alone for now as the analysis tends to assign very high skill levels to some goalkeepers, especially those attempting lots of long passes. My suspicion is that long balls up-field that are successfully headed by a goalkeeper’s team-mate are receiving a bit too much credit i.e. yes the pass was ‘successful’ but that doesn’t always mean that possession was retained after the initial header. That isn’t necessarily the fault of the goalkeeper, who is generally adhering to the tactics of their team and the match situation but I’m not sure it really reflects what we envisage as passing ‘skill’ when it comes to goalkeepers. Discriminating between passes to feet and aerial balls would be a useful addition to the analysis here.
  • Using minutes as the cut-off for the skill ratings leaves a lot of information on the table. The best and worst passers can be pretty reliably separated after just a few hundred passes e.g. Ruben Loftus-Cheek shows up as an excellent passer after just ~2000 minutes in the Premier League. Being able to quickly assess young players and new signings should be possible. The number of passes a player makes should also be taken into account to assess the uncertainty in the ratings.
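On the last point, one simple way to attach an uncertainty to a rating is sketched below. It assumes the rating is completed passes divided by expected completions, and that each pass is an independent coin flip with the model's completion probability; independence is a strong assumption and the numbers are hypothetical, so treat this as a starting point rather than the method used for the ratings above.

```python
import math


def rating_with_uncertainty(xp_values, completed):
    """Return (rating, standard error) for a player.

    xp_values: expected completion probability of every pass attempted.
    completed: number of those passes actually completed.
    Under the independence assumption, the variance of the completion
    count for an average player is the sum of p * (1 - p) over all passes.
    """
    expected = sum(xp_values)
    variance = sum(p * (1.0 - p) for p in xp_values)
    rating = completed / expected
    std_error = math.sqrt(variance) / expected
    return rating, std_error


# Two hypothetical players with the same rating but different pass volumes.
small_sample = rating_with_uncertainty([0.8] * 400, completed=336)
large_sample = rating_with_uncertainty([0.8] * 1600, completed=1344)
```

Both players rate the same, but quadrupling the number of passes halves the standard error, which is the sense in which a few hundred passes may already separate the extremes.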

I’ve gone on enough about this, so I’ll finish by saying that any feedback on the analysis and ratings is welcome. To facilitate that, I’ve built a Tableau dashboard that you can mess around with that is available from here and you can find the raw data here.

Time to pass and move.


Is scoring ability maintained from season to season?

With the football season now over across the major European leagues, analysis and discussion turns to reflection of the who, what and why of the past year. With the transfer window soon to do whatever the opposite of slam shut is, thoughts also turn to how such reflections might inform potential transfer acquisitions. As outlined by Gabriele Marcotti today in the Wall Street Journal, strikers are still the centre of attention when it comes to transfers:

The game’s obsession with centerforwards is not new. After all, it’s the glamour role. Little kids generally dream of being the guy banging in the goals, not the one keeping them out.

On the football analytics front, there has been a lot of discussion surrounding the relative merits of various forward players, with an increasing focus on their goal scoring efficiency (or shot conversion rate) and where players are shooting from. There has been a lot of great work produced but a very simple question has been nagging away at me:

Does being ‘good’ one year suggest that you’ll be ‘good’ next year?

We can all point to examples of forwards shining brightly for a short period during which they plunder a large number of goals, only to then fade away as regression to their (much lower) mean skill level ensues. With this in mind, let’s take a look at some data.

Scoring proficiency

I’ve put together data on players over the past two seasons who have scored at least 10 goals during a single season in the top division in either England, Spain, Germany or Italy from WhoScored. Choosing 10 goals is basically arbitrary but I wanted a reasonable number of goals so that calculated conversion rates didn’t oscillate too wildly and 10 seems like a good target for your budding goalscorer. So for example, Gareth Bale is included as he scored 21 in 2012/13 and 9 goals in 2011/12 but Nikica Jelavić isn’t as he didn’t pass 10 league goals in either season. Collecting the data is painful so a line had to be drawn somewhere. I could have based it on shots per game but that is prone to the wild shooting of the likes of Adel Taarabt and you end up with big outliers. If a player was transferred to or from a league within the WhoScored database (so including France), I retained the player for analysis but if they left the ‘Big 5’ then they were booted out.

I ended up with 115 players who had scored at least 10 league goals in one of the past two seasons. Only 43 players managed to score 10 league goals in both 2011/12 and 2012/13, with only 6 players not named Lionel Messi or Cristiano Ronaldo able to score 20 or more in both seasons. Below is how they match up when comparing their shot conversion, where their goals are divided by their total shots, across both seasons. The conversion rates are based on all goals and all shots; ideally you would take out penalties but that takes time to collate and I doubt it will make much difference to the conclusions.

Comparison between shot conversion rates for players in 2011/12 and 2012/13. Click on the image or here for a larger interactive version.

If we look at the whole dataset, we get a very weak relationship between shot conversion in 2012/13 relative to shot conversion in 2011/12. The R^2 here is 0.11, which suggests that shot conversion by an individual player shows 67% regression to the mean from one season to the next. The upshot of this is that shot conversion above or below the mean is around two-thirds due to luck and one-third due to skill. Without filtering the data any further, this would suggest that predicting how a player will convert their chances next season based on the last will be very difficult.
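The step from R^2 to a regression percentage can be sketched as follows. This assumes the regression to the mean is taken as 1 - r, where r is the (positive) correlation coefficient between the two seasons; that convention reproduces the 67% quoted above but is my reading of the calculation.

```python
import math


def regression_to_mean(r_squared):
    """Share of a deviation from the mean expected to vanish, taken as 1 - r."""
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("r_squared must lie between 0 and 1")
    return 1.0 - math.sqrt(r_squared)


luck_share = regression_to_mean(0.11)  # R^2 for the full dataset
skill_share = 1.0 - luck_share
```

With R^2 = 0.11 the correlation is roughly 0.33, giving the two-thirds luck and one-third skill split.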

A potential issue here is the sample size for the number of shots taken by an individual in a season. Dimitar Berbatov’s conversion rate of 44% in 2011/12 is for only 16 shots; he’s good but not that good. If we filter for the number of shots, we can take out some of the outliers and hopefully retain a representative sample. Up to 50 shots, we’re still seeing a 65% regression to the mean and we’ve reduced our sample to 72 players. It is only when we get up to 70 shots and down to 44 players that we see a close to even split between ‘luck’ and ‘skill’ (54% regression to the mean). The problem here is that we’re in danger of ‘over-fitting’ as we rapidly reduce our sample size. If you are happy with a sample of 18 players, then you need to see around 90 shots per season to be able to attribute 80% of shot conversion to ‘skill’.
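The filtering exercise above can be sketched like this. It assumes the shot threshold is applied to both seasons (the text doesn't say whether it was one or both) and again takes regression to the mean as 1 - r; the player records are hypothetical and are not the WhoScored data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def conversion_repeatability(players, min_shots):
    """Filter players by shots per season, then correlate conversion rates."""
    kept = [
        p for p in players
        if p["shots_1"] >= min_shots and p["shots_2"] >= min_shots
    ]
    conv_1 = [p["goals_1"] / p["shots_1"] for p in kept]
    conv_2 = [p["goals_2"] / p["shots_2"] for p in kept]
    r = pearson_r(conv_1, conv_2)
    return len(kept), r, 1.0 - r


# Hypothetical records: goals and shots in season 1 and season 2.
players = [
    {"goals_1": 7, "shots_1": 16, "goals_2": 15, "shots_2": 85},   # tiny first sample
    {"goals_1": 20, "shots_1": 100, "goals_2": 18, "shots_2": 120},
    {"goals_1": 12, "shots_1": 80, "goals_2": 10, "shots_2": 100},
    {"goals_1": 15, "shots_1": 60, "goals_2": 21, "shots_2": 70},
]
n_kept, r, regression = conversion_repeatability(players, min_shots=50)
```

Raising min_shots removes noisy small-sample conversion rates but also shrinks the number of players the correlation is based on, which is the trade-off described above.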

Born again

So where does that leave us? Perhaps unsurprisingly, the results here for players are similar to what James Grayson found at the team level, with a 61% regression to the mean from season to season. Mark Taylor found that around 45 shots was where skill overtook luck for assessing goal scoring, so a little lower than what I found above, although I suspect this is due to Mark’s work being based on a larger sample over 3 seasons in the Premier League.

The above also points to the ongoing importance of sample size when judging players, although I’d want to do some more work on this before being too definitive. Judgements on around half a season of shots appear rather unwise and are about as good as flipping a coin. Really you want around a season for a fuller judgement and even then you might be a little wary of spending too much cash. For something approaching a guarantee, you want some heavy shooting across two seasons, which allied with a good conversion rate can bring you over 20 league goals in a season. I guess that is why the likes of Van Persie, Falcao, Lewandowski, Cavani and Ibrahimovic go for such hefty transfer fees.

Is playing style important?

I’ve previously looked at whether different playing styles can be assessed using seasonal data for the 2011/12 season. The piece concentrated on whether it was possible to separate different playing styles using a method called Principal Component Analysis (PCA). At a broad level, it was possible to separate teams between those that were proactive and reactive with the ball (Principal Component 1) and those that attempted to regain the ball more quickly when out of possession (Principal Component 2). What I didn’t touch upon was whether such features were potentially more successful than others…

Below is the relationship between points won during the 2011/12 season and the proactive/reactive principal component. The relationship between these variables suggests that more proactive teams, that tend to control the game in terms of possession and shots, are more successful. However, the converse could also be true to an extent in that successful teams might have more of the ball and thus have more shots and concede fewer. Either way, the relationship here is relatively strong, with an R^2 value of 0.61.


Relationship between number of points won in the 2011/12 season with principal component 1, which relates to the proactive or reactive nature of a team. More proactive teams are to the right of the horizontal axis, while more reactive teams are to the left of the horizontal axis. The data is based on the teams in the top division in Germany, England, Spain, France and Italy from WhoScored. The black line is the linear trend between the two variables. A larger interactive version of the plot is available either by clicking on the graph or clicking here.

Looking at the second principal component, there is basically no relationship at all with points won last season, with an R^2 value of a whopping 0.0012. The trend line on the graph is about as flat as a pint of lager in a chain sports bar. There is a hint of a trend when looking at the English and French leagues individually but the sample sizes are small here, so I wouldn’t get too excited yet.

Playing style is important then?

It’s always tempting when looking at scatter plots with nice trend lines and reasonable R^2 values to reach very steadfast conclusions without considering the data in more detail. This is likely an issue here as one of the major drivers of the ‘proactive/reactive’ principal component is the number of shots attempted and conceded by a team, which is often summarised as a differential or ratio. James Grayson has shown many times how Total Shots Ratio (TSR, defined as total shots for/(total shots for + total shots against)) is related to the skill of a football team and its ability to turn that control of a game into success over a season. That certainly appears to play a role here, as this graph demonstrates: the relationship between points and TSR yields an R^2 value of 0.59. For comparison, the relationship between points and short passes per game yields an R^2 value of 0.52. As one would expect based on the PCA results and this previous analysis, TSR and short passes per game are also correlated (R^2 = 0.58).
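For reference, the TSR definition given above translates directly into code. The shot totals in the example are hypothetical.

```python
def total_shots_ratio(shots_for, shots_against):
    """TSR = shots for / (shots for + shots against)."""
    total = shots_for + shots_against
    if total == 0:
        raise ValueError("no shots recorded")
    return shots_for / total


# Hypothetical season: 600 shots taken, 400 conceded.
tsr = total_shots_ratio(600, 400)
```

A team that takes exactly as many shots as it concedes has a TSR of 0.5, so values above that indicate a side that out-shoots its opponents.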

Circular argument

As ever, it is difficult to pin down cause and effect when assessing data. This is particularly true in football when using seasonal averaged statistics as score effects likely play a significant role here in determining the final totals and relationships. Furthermore, the input data for the PCA is quite limited and would be improved with more context. However, the analysis does hint at more proactive styles of play being more successful; it is a challenge to ascribe how much of this is cause and how much is effect.

Danny Blanchflower summed up his footballing philosophy with this quote:

The great fallacy is that the game is first and last about winning. It is nothing of the kind. The game is about glory, it is about doing things in style and with a flourish, about going out and beating the other lot, not waiting for them to die of boredom.

The question is, is the glory defined by the style or does the style define the glory?

Assessing team playing styles

The perceived playing style of a football team is a much debated topic with conversations often revolving around whether a particular style is “good/bad” or “entertaining/boring”. Such perceptions are usually based upon subjective criteria and personal opinions. The question is whether the playing style of a team can be assessed using data to categorise and compare different teams.

WhoScored report several variables (e.g. data on passing, shooting, tackling) for the teams in the top league in England, Spain, Italy, Germany and France. I’ve collated these variables for last season (2011/12) in order to examine whether they can be used to assess the playing style of these sides. In total there are 15 variables, which are somewhat limited in scope but should serve as a starting point for such an analysis. Goals scored or conceded are not included as the interest here is how teams actually play, rather than how it necessarily translates into goals. The first step is to combine the data in some form in order to simplify their interpretation.

Principal Component Analysis

One method for exploring datasets with multiple variables is Principal Component Analysis (PCA), which is a mathematical technique that attempts to find the most common patterns within a dataset. Such patterns are known as ‘principal components’, which describe a certain amount of the variability in the overall dataset. These principal components are numbered according to the amount of variance in the dataset that they account for. Generally this means that only the first few principal components are examined as they account for the greatest percentage variance in the dataset. Furthermore, the object is to simplify the dataset so examining a large number of principal components would somewhat negate the point of the analysis.

The video below gives a good explanation of how PCA might be applied to an everyday object.

Below is a graph showing the first and second principal components plotted against each other. Each data point represents a single team from each of the top leagues in England, Spain, Italy, Germany and France. The question though is what does each of these principal components represent and what can they tell us about the football teams included in the analysis?

Principal component analysis of all teams in the top division in England, Spain, Italy, Germany and France. Input variables are taken from WhoScored.com for the 2011/12 season.

The first principal component accounts for 37% of the variance in the dataset, which means that just over a third of the spread in the data is described by this component. This component is represented predominantly by data relating to shooting and passing, which can be seen in the graph below. Passing accuracy and the average number of short passes attempted per game are both strongly negatively-correlated (r=-0.93 for both) with this principal component, which suggests that teams positioned closer to the bottom of the graph retain possession more and attempt more short passes; unsurprisingly Barcelona are at the extreme end here. Total shots per game and total shots on target per game are also strongly negatively-correlated (r=-0.88 for both) with the first principal component. Attempted through-balls per game are also negatively correlated (r=-0.62). In contrast, total shots conceded per game and total aerial duels won per game are positively-correlated (r=0.65 & 0.59 respectively). So in summary, teams towards the top of the graph typically concede more shots and win more aerial duels, while as you move down the graph, teams attempt more short passes with greater accuracy and have more attempts at goal.

The first principal component is reminiscent of a relationship that I’ve written about previously, where the ratio of shots attempted:conceded was well correlated with the number of short passes per game. This could be interpreted as a measure of how “proactive” a team is with the ball in terms of passing and how this transfers to a large number of shots on goal, while also conceding fewer shots. Such teams tend to have a greater passing accuracy also. These teams tend to control the game in terms of possession and shots.

The second principal component accounts for a further 18% of the variance in the dataset [by convention the principal components are numbered according to the amount of variance described]. This component is positively correlated with tackles (0.77), interceptions (0.52), fouls won (0.68), fouls conceded (0.74), attempted dribbles (0.59) and offsides won (0.63). In essence, teams further to the right of the graph attempt more tackles, interceptions and dribbles which unsurprisingly leads to more fouls taking place during their matches.

The second principal component appears to relate to changes in possession or possession duels, although the data only relates to attempted tackles, so there isn’t any information on how successful these are and whether possession is retained. Without more detail, it’s difficult to sum up what this component represents but we can describe the characteristics of teams and leagues in relation to this component.

Correlation score graph for the principal component analysis. PS stands for Pass Success.

The first and second components together account for 55% of the variance in the dataset. Adding more and more components to the solution would drive this figure upwards but in ever diminishing amounts e.g. the third component accounts for 8% and the fourth accounts for 7%. For simplicity and due to the further components adding little further interpretative value, the analysis is limited to just the first two components.
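The quantities quoted above (the share of variance explained by a component and the correlation of each input variable with it) can be sketched in a few lines. This is a toy version under several assumptions: the inputs are standardised before the analysis (the post doesn't say whether they were), the leading component is found by power iteration rather than a library routine, and the three variables and five teams are hypothetical stand-ins for the 15 WhoScored variables.

```python
import math


def standardise(rows):
    """Scale every column to zero mean and unit variance."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / n) for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def correlation_matrix(z):
    """Correlation matrix of rows that have already been standardised."""
    n, k = len(z), len(z[0])
    return [[sum(row[i] * row[j] for row in z) / n for j in range(k)] for i in range(k)]


def mat_vec(matrix, vec):
    """Multiply a square matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]


def first_component(matrix, iterations=500):
    """Leading eigenvector and eigenvalue via power iteration.

    The start vector is arbitrary; it only needs to not be orthogonal
    to the leading eigenvector.
    """
    vec = [1.0] + [0.0] * (len(matrix) - 1)
    for _ in range(iterations):
        nxt = mat_vec(matrix, vec)
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    eigenvalue = sum(v * w for v, w in zip(vec, mat_vec(matrix, vec)))
    return vec, eigenvalue


# Hypothetical teams: short passes, shots and shots conceded per game.
teams = [
    [600.0, 18.0, 8.0],
    [450.0, 15.0, 11.0],
    [380.0, 13.0, 12.0],
    [320.0, 11.0, 15.0],
    [260.0, 10.0, 17.0],
]
corr = correlation_matrix(standardise(teams))
vec, eigenvalue = first_component(corr)
# Each standardised variable has variance 1, so total variance equals the variable count.
variance_share = eigenvalue / len(corr)
# Correlation of each input variable with the first component.
loadings = [v * math.sqrt(eigenvalue) for v in vec]
```

The sign of a component is arbitrary, so whether passing and shooting come out negatively correlated with the first component (as in the graph) or positively (as here) is a convention; what matters is that shots conceded takes the opposite sign.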

Assessing team playing styles

So what do these principal components mean and how can we use them to interpret team styles of play? Putting all of the above together, we can see that there are significant differences between teams within single leagues and when comparing all five as a whole.

Within the English league, there is a distinct separation between more proactive sides (Liverpool, Spurs, Chelsea, Manchester United, Arsenal and Manchester City) and the rest of the league. Swansea are somewhat atypical, falling between the more reactive English teams and the proactive 6 mentioned previously. Stoke could be classed as the most “reactive” side in the league based on this measure.

There isn’t a particularly large range in the second principal component for the English sides, probably due to the multiple correlations embedded within this component. One interesting aspect is how all of the English teams are clustered to the left of the second principal component, which suggests that English teams attempt fewer tackles, make fewer interceptions and win/concede fewer fouls compared with the rest of Europe. Inspection of the raw data supports this. This contrasts with the clichéd blood and thunder approach associated with football in England, whereby crunching tackles fly in and new foreign players struggle to adapt to the intense tackling approach. No doubt there is more subtlety inherent in this area and the current analysis doesn’t include anything about the types of tackles/interceptions/fouls, where on the pitch they occur or who perpetrates them, but this is an interesting feature pointed out by the analysis and worthy of further exploration in the future.

The substantial gulf in quality between the top two sides in La Liga and the rest is well documented but this analysis shows how much they differed in style from the rest of the league last season. Real Madrid and Barcelona have more of the ball, take more shots and concede far fewer shots compared with their Spanish peers. However, in terms of style, La Liga is split into three groups: Barcelona, Real Madrid and the rest. PCA is very good at evaluating differences in a dataset and with this in mind we could describe Barcelona as the most “different” football team in these five leagues. Based on the first principal component, Barcelona are the most proactive team in terms of possession and this translates to their ratio of shots attempted:conceded; no team conceded fewer shots than Barcelona last season. This is combined with their pressing style without the ball, as they attempt more tackles and interceptions relative to many of their peers across Europe.

Teams from the Bundesliga are predominantly grouped to the right-hand-side of the second principal component, which suggests that teams in Germany are keen to regain possession relative to the other leagues analysed. The Spanish, Italian and French tend to fall between the two extremes of the German and English teams in terms of this component.

All models are wrong, but some are useful

The interpretation of the dataset is the major challenge here; Principal Component Analysis is purely a mathematical construct that doesn’t know anything about football! While the initial results presented here show potential, the analysis could be significantly improved with more granular data. For example, the second principal component could be improved by including information on where the tackles and interceptions are being attempted. Do teams in England sit back more compared with German teams? Does this explain the lower number of tackles/interceptions in England relative to other leagues? Furthermore, the passing and shooting variables could be improved with more context; where are the passes and shots being attempted?

The results are encouraging here in a broad sense – Barcelona do play a different style compared with Stoke and they are not at all like Swansea! There are many interesting features within the analysis, which are worthy of further investigation. This analysis has concentrated on the contrasts between different teams, rather than whether one style is more successful or “better” than another (the subject of a future post?). With that in mind, I’ll finish with this quote from Andrés Iniesta from his interview with Sid Lowe for the Guardian from the weekend.

…the football that Spain and Barcelona play is not the only kind of football there is. Counter-attacking football, for example, has just as much merit. The way Barcelona play and the way Spain play isn’t the only way. Different styles make this such a wonderful sport.

____________________________________________________________________

Background reading on Principal Component Analysis

  1. RealClimate

Assessing forward involvement

One of the more interesting innovations from an analytical standpoint at the current European Championship has been the measuring of the amount of time that a player spends with the ball per game. This measure of player involvement has in particular been applied to forward players, such as Mario Gomez. Gomez managed to score 3 goals from 6 shots in 2 games despite only having the ball for 22 seconds, according to Prozone. This contrasted with Robin Van Persie, who was seemingly more involved in general play, scoring 1 goal from 10 shots in 106 seconds.

This prompts the question: can we assess such player involvement on a wider level, with particular focus on forward players?

Without access to the time-in-possession statistics, another measure is required. The number of passes per game should give a reasonable approximation of how involved a forward is in general play. Contrasting this with the number of shots attempted per game should provide a comparison between a forward’s goal-scoring duties and his overall involvement in play.

Top European League analysis

Below is a comparison of the number of shots a forward attempts per game vs the number of passes he attempts per game. The data is taken from WhoScored.com and covers all players who are classified as forwards and have started 10 games or more in the top division in England, Spain, Italy, Germany and France. The graph includes players who have played in a non-forward role at some point in the season, as defined by WhoScored. For example, Cristiano Ronaldo is classified as playing as both a left-sided attacking midfielder and a forward, although in this case the distinction is likely irrelevant. Including players who have at some point played outside of the forward line has little impact upon the general trend and averages (see table below).

Relationship between number of shots attempted per game vs number of passes attempted per game by forward players in the top division in England, Spain, Italy, Germany and France. The points are coloured by the number of goals scored by each player. The vertical dashed grey line indicates the average number of passes per game by these players, while the horizontal dashed grey line indicates the average number of shots attempted by these players. The text boxes (Z1, Z2, Z3, Z4) designate the zones of interest referred to in the text. All data is taken from WhoScored.com for the 2011/12 season. An interactive version of the plot is available here, where you can find any of the forwards included in the study.

Filter          Players   Shots/game   Passes/game   Goals
Forwards only   130       2.06±0.76    18.72±6.24    7.96±5.85
Mixed           135       2.10±1.01    24.67±8.82    8.23±7.57
All             265       2.08±0.89    21.75±8.21    8.10±6.77

Comparison of the different player position classifications prescribed by WhoScored. The mean and standard deviation for shots/game, passes/game and goals scored are given for each group. Mixed refers to players who have been classed as playing as both a forward and another position (generally as an attacking midfielder) at some point in the 2011/12 season.
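As a quick consistency check, the means in the “All” row can be recovered from the other two rows by weighting each group by its number of players. The sketch below uses only the values printed in the table; the function name is just for illustration.

```python
# Values copied from the table: (players, shots/game, passes/game, goals).
groups = {
    "Forwards only": (130, 2.06, 18.72, 7.96),
    "Mixed": (135, 2.10, 24.67, 8.23),
}

def pooled_means(rows):
    """Combine group means into overall means, weighting by player count."""
    total = sum(row[0] for row in rows)
    n_cols = len(rows[0]) - 1
    means = [sum(row[0] * row[i + 1] for row in rows) / total
             for i in range(n_cols)]
    return total, means

total, (shots, passes, goals) = pooled_means(list(groups.values()))
print(f"{total} {shots:.2f} {passes:.2f} {goals:.2f}")  # 265 2.08 21.75 8.10
```

The standard deviations cannot be pooled this simply, as the combined spread also depends on how far apart the two group means are.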

In general, there is a weak positive relationship between shots attempted and passes attempted by forward players (correlation coefficient of 0.46 if you are that way inclined). The major feature though is that there is a great deal of variability across the forward players in terms of their involvement in play relative to their goal-scoring attempts.
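For anyone who is that way inclined, here is a minimal sketch of how a correlation coefficient of this kind can be calculated, assuming it is the standard Pearson coefficient. The per-game numbers are hypothetical and are not the WhoScored data.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical passes/game and shots/game for six forwards.
passes_per_game = [12.0, 18.5, 21.0, 25.5, 30.0, 41.0]
shots_per_game = [2.4, 1.6, 2.0, 1.9, 3.1, 2.8]
print(f"r = {pearson(passes_per_game, shots_per_game):.2f}")
```

A value near 1 or -1 means the points sit close to a straight line, while a value near 0 means little linear relationship.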

Players such as Mario Gomez and Jermain Defoe take an above average number of shots relative to the number of passes they attempt (Zone 1), with Gomez in particular being prolific for Bayern Munich with 26 goals in 30 Bundesliga starts. Other notable forwards with these traits include Antonio Di Natale, Robert Lewandowski, Edinson Cavani, Mario Balotelli and Falcao, who attempt a slightly below average number of passes but still attempt a large number of shots per game. Fernando Llorente and Andy Carroll also reside in this zone, with similar values for shots attempted and passes attempted. Players in this zone score 9.6 goals on average.

Several “star” forwards reside in Zone 2, where forwards take an above average number of shots and attempt an above average number of passes. The two extremes here are unsurprisingly Lionel Messi and Cristiano Ronaldo, who attempt the most passes and take the most shots respectively out of all of the forwards in the study. Messi ranks 34th for the number of passes across the top five European leagues, some 30 passes behind his Barcelona team-mate Xavi. Clearly, Messi’s false-nine role for Barcelona allows him to become extremely involved in general play and to even dictate it at times. He combines this with being Barcelona’s primary provider of shots on goal and indeed goals. Ronaldo is also involved significantly in Real Madrid’s play and incredibly attempts almost 7 shots per game. Several other notable forwards in this zone include Francesco Totti, Wayne Rooney, Zlatan Ibrahimovic, Raúl, Luis Suárez and Robin Van Persie with some of these forwards being more prolific than others. Clint Dempsey is an example of someone who generally plays outside of the forward line but is included here as he did play up-front for Fulham this season (scoring 5 goals in 5 games according to WhoScored). Players in this zone score 13.7 goals on average, although this is somewhat skewed by the exploits of Messi and Ronaldo (12.6 goals on average when excluding them).

Out of the 265 players included, 98 attempt both a lower than average number of shots and passes per game. In general, the number of goals scored in this group (Zone 3) is unremarkable, with the average goals scored per player being 5. However, there is one significant over-performer: Gonzalo Higuaín scored 22 league goals from 60 shots last season. In most squads, this would guarantee more games but he was up against Karim Benzema, who by comparison scored a paltry 21 goals from 100 shots. However, an added benefit of Benzema based on this analysis is that he is far more involved in general play.
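To put the Higuaín and Benzema comparison on the same scale, the goals and shots quoted above can be turned into a conversion rate. This is a small sketch; the function name is just for illustration.

```python
def conversion_rate(goals, shots):
    """Fraction of shots that resulted in a goal."""
    if shots <= 0:
        raise ValueError("shots must be positive")
    return goals / shots

# Goals and shots as quoted in the text.
higuain = conversion_rate(22, 60)
benzema = conversion_rate(21, 100)
print(f"Higuaín: {higuain:.1%}")  # 36.7%
print(f"Benzema: {benzema:.1%}")  # 21.0%
```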

The last group (Zone 4) includes players who take fewer shots than average but attempt more passes than average. Many of these players are more attacking midfield players than forwards, such as Dirk Kuyt. Again, a Fulham player is a good example of a player who rarely plays as a forward being included in the analysis, as Moussa Dembélé generally plays in midfield. Players in this zone score 5.3 goals on average, essentially the same as those in Zone 3.
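The four zones can be sketched as a simple classification against the two averages. The thresholds below are the “All” averages from the table; treating a player exactly on an average as “not above” it is an assumption, and the example players and their numbers are hypothetical.

```python
AVG_SHOTS = 2.08    # average shots/game, "All" row of the table
AVG_PASSES = 21.75  # average passes/game, "All" row of the table

def zone(shots_per_game, passes_per_game,
         avg_shots=AVG_SHOTS, avg_passes=AVG_PASSES):
    """Assign a forward to Z1-Z4 by comparing him with the two averages."""
    high_shots = shots_per_game > avg_shots
    high_passes = passes_per_game > avg_passes
    if high_shots and not high_passes:
        return "Z1"  # many shots, few passes
    if high_shots and high_passes:
        return "Z2"  # many shots, many passes
    if not high_shots and not high_passes:
        return "Z3"  # few shots, few passes
    return "Z4"      # few shots, many passes

# Hypothetical players: (shots/game, passes/game).
examples = {
    "Player A": (3.5, 14.0),
    "Player B": (4.8, 45.0),
    "Player C": (1.2, 15.5),
    "Player D": (1.5, 33.0),
}
for name, (shots, passes) in examples.items():
    print(name, zone(shots, passes))
```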

Finishing the jigsaw

Clearly there is a large variation in how involved a forward player is in general play versus how often he attempts to score. Such differences are likely driven by both the individual player in terms of their skills and style of play alongside their tactical role within the team. Mario Gomez for instance has very similar numbers from the current European Championship for Germany as he does for his club side, although this could be a statistical quirk given the small sample size. It would be interesting to analyse how an individual performs by these measures across multiple games in multiple tactical systems.

There isn’t necessarily a better “zone” in this analysis but teams should bear these traits in mind when attempting to improve their squad. For example, Liverpool’s woes in front of goal last season led to calls for a simple poacher to be brought in who would simply “stick the ball in the net”. However, if by bringing in a poacher, Liverpool were to lose the passing and creativity provided by players in other areas, then they could end up exchanging one problem for another. Balance is key in such decisions; hopefully Brendan Rodgers can solve Liverpool’s goal-scoring issues and at least maintain the quality of their chance creation next season.

Resting with the ball

Brendan Rodgers’ appointment as Liverpool manager has prompted some fascinating discussions about his overall playing philosophy and how it might be transferred to his new club. Swansea’s impressive passing statistics have been much quoted in this context; only Manchester City attempted and completed more passes than them last season.

An intriguing aspect of this preference for possession is that it is used as both an offensive and defensive tool. Michael Cox of Zonal Marking previously elaborated on the link between possession and shots attempted per game and showed that in general, teams with more possession had more shots, although there was a large degree of variation around the general trend. However, this only investigates the offensive aspect. The theory behind the defensive aspect is best outlined by the new Liverpool manager:

“Then there’s our defensive organisation…so if it is not going well we have a default mechanism which makes us hard to beat and we can pass our way into the game again. Rest with the ball. Then we’ll build again.”

The inference here is that while you have the ball, the opposition can’t score, and you simultaneously increase your own chances of scoring, as you need the ball in order to score. So the question is: is this true?

One method of ascertaining how well teams accomplish the twin goals of attempting shots on goal and preventing shots on their own goal is to take the ratio between them. If this ratio is greater than 1, then a team attempts more shots than it concedes. Conversely, if the ratio is less than 1, then a team concedes more shots than it attempts. This is by no means a perfect metric, as not all shots are created equal but it does give us something to begin with.
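As a small worked example of this ratio, the sketch below uses the per-game shot figures quoted further down in this post for Barcelona and Swansea. The function name is just for illustration.

```python
def shot_ratio(attempted_per_game, conceded_per_game):
    """Shots attempted divided by shots conceded.

    Above 1: the team takes more shots than it faces.
    Below 1: the team faces more shots than it takes.
    """
    if conceded_per_game <= 0:
        raise ValueError("shots conceded per game must be positive")
    return attempted_per_game / conceded_per_game

# Per-game figures quoted in this post (WhoScored.com, 2011/12).
barcelona = shot_ratio(16.5, 7.3)
swansea = shot_ratio(12.4, 15.7)
print(f"Barcelona: {barcelona:.2f}")  # 2.26
print(f"Swansea:   {swansea:.2f}")    # 0.79
```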

In order to assess whether this has any relationship with passing, I’ve plotted this ratio against the number of short passes attempted per game by each team in the top leagues in England, Spain, Italy, Germany and France in the figure below. The teams from each league are coloured differently and various teams are highlighted for comparison purposes/interest.

Relationship between shots attempted:conceded vs short passes per game from all teams in the top division in England, Spain, Italy, Germany and France. The vertical dashed grey line indicates the average number of short passes per game by these teams, while the horizontal dashed grey line indicates the average shots attempted:conceded ratio. All data is taken from WhoScored.com for the 2011/12 season.

Broadly, teams that attempt more short passes per game tend to attempt more shots than they concede (correlation coefficient of 0.8 if you are that way inclined). Unsurprisingly, the teams at the extreme ends of the number of short passes are Stoke (229 per game) and Barcelona (655 per game). Barcelona are also at the extreme end of the shots attempted:conceded ratio, attempting well over twice as many shots as they concede. This is largely driven by their ability to prevent their opponents taking shots, as Barcelona have the lowest number of shots conceded per game (only 7.3 per game). Barcelona rank 10th for shots attempted (16.5 per game). Barcelona are adept at “resting with the ball” but you probably already knew that. Many of the teams analysed attempt a below average number of short passes and concede more shots than they attempt. Ajaccio, FC Cologne and Santander posted the lowest shots attempted:conceded ratios, with the latter two being relegated.

Swansea & Liverpool

Swansea are one of the few teams that combined a large number of short passes per game with a well below average shots attempted:conceded ratio. The closest side to Swansea in this sense is Athletic Bilbao, another side who value possession and pressing highly. Clearly Swansea keep the ball well and translated this to a reasonable number of attempted shots per game (12.4, joint 15th highest in the EPL, mid-table across all 5 leagues). Furthermore, Swansea’s patient style of play seeks to create higher quality shooting opportunities; a lower number isn’t necessarily a bad thing as it isn’t artificially inflated by long-range pot-shots that threaten the corner flag rather than the goal.

However, compared to other teams that play an above average number of short passes per game, their shots conceded per game is relatively high (15.7, 7th highest in the EPL). Indeed, of the 11 teams that conceded more shots per game across the five leagues, 8 of them finished in the bottom 4 of their respective leagues (6 were relegated). As mentioned previously, not all shots are created equal but Swansea conceded 59% of these shots within their own penalty area, which was joint 4th highest in the EPL. Without delving further into numbers and analysis, this potentially suggests that Swansea are good at keeping the ball but perhaps were not as good at transitioning to their defensive duties either individually or collectively when they lost it.

Liverpool on average attempted close to 60% more shots than they conceded, with only 8 teams achieving a larger ratio. In Liverpool’s case, this was driven by both being able to execute a large number of shots on their opponent’s goal and combining this with a low number of shots on their own goal. Liverpool ranked 4th for shots attempted in the EPL (6th across all 5 leagues) and 3rd for shots conceded (15th across all 5 leagues). This was combined with the 7th highest number of short passes per game in the EPL. As has been shown many times over the past season, Liverpool’s major problem statistically was their woeful translation of shots to goals.

The way forward for Liverpool

Liverpool under Kenny Dalglish could hardly be described as a “route one” football team, although the passing style was at times impatient and overly focussed upon crossing to what often seemed like unidentified targets in the penalty area. However, there is a significant difference between the number of short passes played by Swansea (497 per game) and Liverpool (440 per game). Next season, Liverpool will presumably move towards and perhaps exceed 500 short passes per game as the influence of Rodgers’ possession-orientated playing philosophy takes hold. At the very least, Liverpool should be looking to maintain their shots attempted:conceded ratio from last season to the next. A more patient style of play may help to deliver more players in the final third in order to create and take shooting opportunities. If such patience also delivers some more-composed finishing, then Liverpool under Brendan Rodgers could be very exciting indeed.

——————————————————————————————————————–

Credit

All the raw data is taken from WhoScored.com.

The quote from Brendan Rodgers is taken from the excellent analysis by Jed Davies which is linked below.

Further reading

  1. Roy Henderson on The Anfield Wrap
  2. Stephen McCarthy on EPL Index
  3. Mihail Vladimirov on The Tomkins Times (£)
  4. Jed Davies on The Path is Made By Walking