Counting counters

Over on StatsBomb, I’ve written about Leicester’s attacking exploits this season, specifically focusing on the style and effectiveness of their attack. That required a fair amount of research into various aspects relating to the speed and directness of teams’ attacks, which I’ve looked into since I started looking at possessions and expected goals.

One output of all that is a bunch of numbers at the team and player level stretching back over the past four seasons about fast-attacks and counter-attacks, some of which I will post below along with some comments.

As a brief reminder, a possession is a passage of play where a team maintains unbroken control of the ball. I class a possession moving at greater than 5 m/s on average as ‘fast’ based on looking at a bunch of diagnostics relating to all possessions, i.e. not just those ending with a shot. The final number is fairly arbitrary as I just went with a round number rather than a precisely calculated one but the interpretation of the results didn’t shift much when altering the boundary. Looking at the data, there is probably some separation into slow attacks (<2 m/s), medium-paced attacks (2-5 m/s) and then the fast attacks (>5 m/s). Note that some attacks go away from goal, so they end up with a negative speed (technically I’m calculating velocity here but I’ll leave that for another time); the bands above therefore refer to attacks moving towards the goal.

Counter-attacks are when these fast-paced moves begin in a team’s own half. Again this is fairly arbitrary from a data point-of-view but it at least fits in with what I think most would consider to be a counter-attack and it’s very easy to split the data into narrower bands in future.
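The definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind the numbers in this post: the `Possession` fields, the 105 m pitch length and the treatment of values falling exactly on a boundary are all hypothetical choices.

```python
from dataclasses import dataclass

SLOW_MAX = 2.0        # m/s: below this the post calls an attack slow
FAST_MIN = 5.0        # m/s: above this the post calls an attack fast
PITCH_LENGTH = 105.0  # metres; assumed, not stated in the post


@dataclass
class Possession:
    start_x: float   # metres from the attacking team's own goal line
    end_x: float     # same axis, where the possession ended
    duration: float  # seconds, assumed to be greater than zero


def velocity_towards_goal(p: Possession) -> float:
    """Average velocity along the pitch; negative means play went backwards."""
    return (p.end_x - p.start_x) / p.duration


def pace_band(p: Possession) -> str:
    """Place a possession in one of the speed bands described in the text."""
    v = velocity_towards_goal(p)
    if v < 0:
        return "away from goal"
    if v < SLOW_MAX:
        return "slow"
    if v <= FAST_MIN:
        return "medium"
    return "fast"


def is_counter_attack(p: Possession) -> bool:
    """A fast possession that began in the team's own half."""
    return pace_band(p) == "fast" and p.start_x < PITCH_LENGTH / 2
```

Splitting the data into narrower bands would only mean changing the two thresholds or adding more of them.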

I should add that Michael Caley has published analysis and data relating to counter-attacking, although he is apparently in the process of revising these.

All of the numbers below are based on my expected goals model using open-play shots only. I don’t include a speed of attack or counter-attacking adjustment in my model.

So, without further ado, here are some graphs…

Top-20 offensive fast-attacking teams

Top 20 teams in terms of fast-attacking expected goals for over the past four seasons.

Champions Elect Leicester City sit atop the pile with a reasonable gap on THAT Liverpool team, with a fairly big drop to the chasing pack behind. Arsenal and Manchester City are quite well represented here illustrating the diversity of their attacks – while both are typically among the slowest teams on average, they can step it up effectively when presented with the opportunity.

Top-20 offensive counter-attacking teams

Top 20 teams in terms of counter-attacking expected goals for over the past four seasons.

Number one isn’t a huge shock, with this year’s Leicester City narrowly ahead of the 12/13 iteration of Liverpool. A lot of the same teams are found in both the fast-attacking and counter-attacking brackets, which isn’t a great surprise perhaps.

Southampton this year are perhaps a little surprising and it is a big shift from previous seasons (0.056-0.075 per game), although I’ll admit I haven’t paid them that much attention this year. Their defense is the 6th worst in this period on counter-attacks also (3rd worst on fast-attacks). When did Southampton become a basketball team?

What is particularly noticeable is the prevalence of teams from the past two seasons in the top-10. A trend towards more transition-orientated play? Something to examine in more detail at another time perhaps.

Top-20 defensive fast-attacking teams

Top 20 teams in terms of fast-attacking expected goals against over the past four seasons.

Most of the best performances on the defensive side are from the 12/13 and 13/14 seasons, which might give some credence to a greater emphasis more recently on transitions along with an inability to cope with them.

The list overall is populated by the relative mainstays of Manchester City, Liverpool and West Brom along with various fingerprints from Mourinho, Warnock and Pulis.

Top-20 defensive counter-attacking teams

Top 20 teams in terms of counter-attacking expected goals against over the past four seasons.

Interestingly there is a greater diversity between the counter-attacking and fast-attacking metrics on the defensive side of the ball than on the offensive side, which might point to potential strengths and/or weaknesses in certain teams.

Spurs last season rank as the worst defensive side in terms of counter-attacking expected goals against, and are narrowly beaten into second spot for fast-attacks by the truly awful 2012/13 Reading team.

Top-20 fast-attacking players

Top 20 players in terms of fast-attacking expected goals per 90 minutes over the past four seasons. Minimum 2,700 minutes played.

Lastly, we’ll take a quick look at players. For now, I’m just isolating the player who took the shot, rather than those who participated in the build-up to the goal. A lot of this will be tied up in playing style and team effects.

Jamie Vardy is clearly the standout name here, followed by Daniel Sturridge and Danny Ings. Sturridge leads the chart in terms of actual goals with 0.21 goals per 90 minutes, with Vardy third on 0.18.

Vardy’s overall open-play expected goals per 90 minutes stands at 0.26 by my numbers over the past two seasons, so over half of his xG per 90 comes from getting on the end of fast-attacking moves. He sits in 16th place overall for those with over 2,700 minutes played, which is respectable but he is clearly elite when it comes to faster-paced attacks.
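For anyone wanting to reproduce this kind of ranking on their own data, the per 90 rates and the minutes cut-off can be sketched as below. The function names and the tuple layout are hypothetical, and the player rows in the usage comment are made-up placeholders rather than real figures.

```python
def per_90(total: float, minutes: float) -> float:
    """Scale a season total to a rate per 90 minutes played."""
    return total / minutes * 90.0


def rank_per_90(players, min_minutes=2700):
    """Rank (name, total_xg, minutes) rows by xG per 90.

    Players below the minutes threshold are dropped, mirroring the
    'minimum 2,700 minutes played' filter used for the charts.
    """
    eligible = [
        (name, per_90(xg, minutes))
        for name, xg, minutes in players
        if minutes >= min_minutes
    ]
    return sorted(eligible, key=lambda row: row[1], reverse=True)


# Usage with placeholder rows (not real data):
# rank_per_90([("Player A", 6.0, 3000), ("Player B", 9.0, 2700), ("Player C", 5.0, 900)])
```

Dividing a player's fast-attack xG per 90 by their overall xG per 90 then gives the share of their expected goals that comes from fast attacks.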

Top-20 counter-attacking players

Top 20 players in terms of counter-attacking expected goals per 90 minutes over the past four seasons. Minimum 2,700 minutes played.

Danny Ings sits on top when it comes to counter-attacking, which bodes well for his future under Jürgen Klopp at Liverpool, providing his injury hasn’t unduly affected him. Again, Sturridge leads the list in terms of actual goals with 0.13 per 90 minutes, with Vardy second on 0.12. The sample sizes are lower here, so we would expect a greater degree of variance in terms of the comparison between reality and expectation.

One of the interesting things when comparing these lists is how they diverge from, or resemble, the overall goal scorer chart. For example, Edin Džeko and Wilfried Bony sit in first and fourth place respectively in the overall table for this period but lie outside the top-20 when it comes to faster-paced attacks. A clear application of this type of work is player profiling to fit the particular style and needs of a prospective team, which Paul Riley has previously shown to be a useful method for evaluating forwards.

Moving forward

I wanted to post these as a starting point for discussion before I drill down further into the details in the future. The data presented here and that underlying it are very rich in detail and potential applications, which I have already started to explore. In particular, there is a lot of spatial information encapsulated in the data that can inform how teams attack and defend, which can help to build further descriptive elements to team styles alongside measures of their effectiveness.

I’ll keep you posted.

Recruitment by numbers: the tale of Adam and Bobby

One of the charges against analytics is that it hasn’t really demonstrated its utility, particularly in relation to recruitment. This is an argument I have some sympathy with. Having followed football analytics for over three years, I’m well-versed in the metrics that could aid decision making in football but I can appreciate that the body of work isn’t readily accessible without investing a lot of time.

Furthermore, clubs are understandably reticent about sharing the methods and processes that they follow, so successes and failures attributable to analytics are difficult to unpick from the outside.

Rather than add to the pile of analytics in football think-pieces that have sprung up recently, I thought I would try and work through how analysing and interpreting data might work in practice from the point of view of recruitment. Show, rather than tell.

While I haven’t directly worked with football clubs, I have spoken with several people who do use numbers to aid recruitment decisions within them, so I have some idea of how the process works. Data analysis is a huge part of my job as a research scientist, so I have a pretty good understanding of the utility and limits of data (my office doesn’t have air-conditioning though and I rarely use spreadsheets).

As a broad rule of thumb, public analytics (and possibly work done in private also) is generally ‘better’ at assessing attacking players, with central defenders and goalkeepers being a particular blind-spot currently. With that in mind, I’m going to focus on two attacking midfielders that Liverpool signed over the past two summers, Adam Lallana and Roberto Firmino.

The following is how I might employ some analytical tools to aid recruitment.

Initial analysis

To start with I’m going to take a broad look at their skill sets and playing style using the tools that I developed for my OptaPro Forum presentation, which can be watched here. The method uses a variety of metrics to identify different player types, which can give a quick overview of playing style and skill set. The midfielder groups isolated by the analysis are shown below.

Midfielders

Midfield sub-groups identified using the playing style tool. Each coloured circle corresponds to an individual player. Data via Opta.

I think this is a useful starting point for data analysis as it can give a quick snapshot of a player and can also be used for filtering transfer requirements. The utility of such a tool is likely dependent on how well scouted a particular league is by an individual club.

A manager, sporting director or scout could feed into the use of such a tool by providing their requirements for a new signing, which an analyst could then use to provide a short-list of different players. I know that this is one way numbers are used within clubs as the number of leagues and matches that they take an interest in outstrips the number of ‘traditional’ scouts that they employ.

As far as our examples are concerned, Lallana profiles as an attacking midfielder (no great shock) and Firmino belongs in the ‘direct’ attackers class as a result of his dribbling and shooting style (again no great shock). Broadly speaking, both players would be seen as attacking midfielders but the analysis is picking up their differing styles which are evident from watching them play.

Comparing statistical profiles

Going one step further, fairer comparisons between players can be made based upon their identified style e.g. marking down a creative midfielder for taking a low number of shots compared to a direct attacker would be unfair, given their respective roles and playing style.

Below I’ve compared their statistical output during the 2013/14 season, which is the season before Lallana signed for Liverpool and I’m going to make the possibly incorrect assumption that Firmino was someone that Liverpool were interested in that summer also. Some of the numbers (shots, chances created, throughballs, dribbles, tackles and interceptions) were included in the initial player style analysis above, while others (pass completion percentage and assists) are included as some additional context and information.

The aim here is to give an idea of the strengths, weaknesses and playing style of each player based on ranking a player against their peers. Whether ranking low or high on a particular metric is a ‘good’ thing or not depends on the statistic e.g. taking shots from outside the box isn’t necessarily a bad thing to do but you might not want to be top of the list (Andros Townsend in case you hadn’t guessed). Many will also depend on the tactical system of their team and their role within it.

The plots below are to varying degrees inspired by Ted Knutson, Steve Fenn and Florence Nightingale (Steve wrote about his ‘gauge’ graph here). There are more details on these figures at the bottom of the post*.

Lallana.

Data via Opta.

Lallana profiles as a player who is good/average at several things, with chances created seemingly being his stand-out skill here (note this is from open-play only). Firmino on the other hand is strong and even elite at several of these measures. Importantly, these are metrics that have been identified as important for attacking midfielders and they can also be linked to winning football matches.

Firmino.

Data via Opta.

Based on these initial findings, Firmino looks like an excellent addition, while Lallana is quite underwhelming. Clearly this analysis doesn’t capture many things that are better suited to video and live scouting e.g. their defensive work off the ball, how they strike a ball, their first touch etc.

At this stage of the analysis, we’ve got a reasonable idea of their playing style and how they compare to their peers. However, we’re currently lacking further context for some of these measures, so it would be prudent to examine them further using some other techniques.

Diving deeper

So far, I’ve only considered one analytical method to evaluate these players. An important thing to remember is that all methods will have their flaws and biases, so it would be wise to consider some alternatives.

For example, I’m not massively keen on ‘chances created’ as a statistic, as I can imagine multiple ways that it could be misleading. Maybe it would be a good idea then to look at some numbers that provide more context and depth to ‘creativity’, especially as this should be a primary skill of an attacking midfielder for Liverpool.

Over the past year or so, I’ve been looking at various ways of measuring the contribution and quality of player involvement in attacking situations. The most basic of these looks at the ability of a player to find his team mates in ‘dangerous’ areas, which broadly equates to the central region of the penalty area and just outside it.

Without wishing to go into too much detail, Lallana is pretty average for an attacking midfielder on these metrics, while Firmino was one of the top players in the Bundesliga.

I’m wary of writing Lallana off here as these measures focus on ‘direct’ contributions and maybe his game is about facilitating his team mates. Perhaps he is the player who makes the pass before the assist. I can look at this also using data by looking at the attacks he is involved in. Lallana doesn’t rise up the standings here either, again the quality and level of his contribution is basically average. Unfortunately, I’ve not worked up these figures for the Bundesliga, so I can’t comment on how Firmino shapes up here (I suspect he would rate highly here also).

Recommendation

Based on the methods outlined above, I would have been strongly in favour of signing Firmino as he mixes high quality creative skills with a goal threat. Obviously it is early days for Firmino at Liverpool (a grand total of 239 minutes in the league so far), so assessing whether the signing has been successful or not would be premature.

Lallana’s statistical profile is rather average, so factoring in his age and price tag, it would have seemed a stretch to consider him a worthwhile signing based on his 2013/14 season. Intriguingly, when comparing Lallana’s metrics from Southampton and those at Liverpool, there is relatively little difference between them; Liverpool seemingly got the player they purchased when examining his statistical output based on these measures.

These are my honest recommendations regarding these players based on these analytical methods that I’ve developed. Ideally I would have published something along these lines in the summer of 2014 but you’ll just have to take my word that I wasn’t keen on Lallana based on a prototype version of the comparison tool that I outlined above and nothing that I have worked on since has changed that view. Similarly, Firmino stood out as an exciting player who Liverpool could reasonably obtain.

There are many ways I would like to improve and validate these techniques and they might bear little relation to the tools used by clubs. Methods can always be developed, improved and even scrapped!

Hopefully the above has given some insight into how analytics could be a part of the recruitment process.

Coda

If analytics is to play an increasing role in football, then it will need to build up sufficient cachet to justify its implementation. That is a perfectly normal sequence for new methods as they have to ‘prove’ themselves before seeing more widespread use. Analytics shouldn’t be framed as a magic bullet that will dramatically improve recruitment but if it is used well, then it could potentially help to minimise mistakes.

Nothing that I’ve outlined above is designed to supplant or reduce the role of traditional scouting methods. The idea is just to provide an additional and complementary perspective to aid decision making. I suspect that more often than not, analytical methods will come to similar conclusions regarding the relative merits of a player, which is fine as that can provide greater confidence in your decision making. If methods disagree, then they can be examined accordingly as a part of the process.

Evaluating players is not easy, whatever the method, so being able to weigh several assessments that all have their own strengths, flaws, biases and weaknesses seems prudent to me. The goal of analytics isn’t to create some perfect and objective representation of football; it is just another piece of the puzzle.

truth … is much too complicated to allow anything but approximations – John von Neumann


*I’ve done this by calculating percentile figures to give an indication of how a player compares with their peers. Values closer to 100 indicate that a player ranks highly in a particular statistic, while values closer to zero indicate they attempt or complete few of these actions compared to their peers. In these examples, Lallana and Firmino are compared with other players in the attacking midfielder, direct attacker and through-ball merchant groups. The white curved lines are spaced every ten percentiles to give a visual indication of how the player compares, with the solid shading in each segment corresponding to their percentile rank.
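A percentile figure of this kind can be sketched as follows. The footnote doesn't specify how ties are handled, so the mid-rank convention used here (ties count as half) is an assumption, as is the function name.

```python
def percentile_rank(value: float, peer_values: list) -> float:
    """Percentile of `value` within `peer_values`, on a 0-100 scale.

    Uses the mid-rank convention: peers strictly below count fully and
    peers tied with the value count as half. This is one common
    definition, chosen here for illustration only.
    """
    below = sum(1 for v in peer_values if v < value)
    equal = sum(1 for v in peer_values if v == value)
    return 100.0 * (below + 0.5 * equal) / len(peer_values)
```

Here `peer_values` would hold the same statistic for every player in the comparison groups, including the player being ranked.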

Liverpool Looking Up? EPL 2015/16 Preview

Originally published on StatsBomb.

After the sordid love affair that culminated in a strong title challenge in 2013/14, Liverpool barely cast a furtive glance at the Champions League places in 2014/15. Their underlying numbers over the whole season provided scant consolation either, with performance levels in line with a decent team lacking the quality usually associated with a top-four contender. Improvements in results and underlying performance will therefore be required to meet the club’s stated aim of Champions League football.

Progress before a fall

Before looking forward to the coming season, let’s start with a look back at Liverpool’s performance over recent seasons. Below is a graphic showing Liverpool’s underlying numbers over the past five seasons, courtesy of Paul Riley’s Expected Goal numbers.

Expected goal rank over the past 5 seasons of the English Premier League. Liverpool seasons highlighted in red.

From 2010/11 to 2012/13, there was steady progress with an impressive jump in 2013/14 to the third highest rating over the past five years. Paul’s model only evaluates shots on target, so Liverpool’s 2013/14 rating is potentially biased a little high given their unusual/unsustainable proportion of shots on target that year. However, the quality was clear, particularly in attack. Not to be outdone, 2014/15 saw another impressive jump but unfortunately the trajectory was in the opposite direction. Other metrics such as total shots ratio and shots on target ratio tell a similar story, although 2013/14 isn’t quite as impressive.

The less charitable among you may ascribe Liverpool’s trajectory to the presence and performance of one Luis Suárez; when joining in January 2011, Suárez was an erratic yet gifted performer who went on to become a genuine superstar before departing in the summer of 2014. Suárez’s attacking wizardry in 13/14 was remarkable and he served as a vital multiplier in the side’s pinball style of play. Clearly he was a major loss but there were already reasons to suspect that some regression was due with or without him: Andrew Beasley wrote about the major and likely unsustainable role of set piece goals, while James Grayson and Colin Trainor highlighted the unusually favourable proportions of shots on target and blocked shots respectively during their title challenge. I wrote about how Liverpool’s penchant for early goals had led to an incredible amount of time spent winning over the season (a handy circumstance for a team so adept at counter-attacking), which may well have helped to explain some of their unusual numbers, and argued that it was unlikely to be repeated.

These mitigating and potentially unsustainable factors notwithstanding, the dramatic fall in underlying performance, points (22 in all) and goals scored (an incredible 49 goal decline) is where Liverpool find themselves ahead of the coming season. Such a decline sees Brendan Rodgers go into this season under pressure to justify FSG’s backing of him over the summer, particularly with a fairly nightmarish run of away fixtures to start the season and the spectre of Jürgen Klopp on the horizon.

So, where do Liverpool need to improve this season?

Case for the defence

With the concession of six goals away at Stoke fresh in the memory, the narrative surrounding Liverpool’s defence is strong i.e. the defence is pretty horrible. Numbers paint a somewhat different story with Liverpool’s shots conceded (10.9 per game) standing as the joint-fifth lowest in the league last year according to statistics compiled by the Objective-Football website (rising to fourth lowest in open play). Shots on target were less good (3.8 per game and a rank of joint-seventh) although the margins are fairly small here. By Michael Caley’s and Paul Riley’s expected goal numbers, Liverpool ranked fourth and sixth respectively in expected goals against. Looking at how effective teams were at preventing their opponents from getting the ball into dangerous areas in open-play, my own numbers ranked Liverpool fifth best in the league.

It should be noted that analytics often has something of a blind spot when it comes to analysing defensive performances; metrics which typically work very well on the offensive side often work less well on the defensive side. Liverpool also tend to be a fairly dominant team and their opponents typically favour a deep defence and counter strategy against them, which will limit the number of chances they create.

One area where their numbers (courtesy of Objective-Football again) were noticeably poor was at set-pieces where they conceded on 11.6% of the shots by their opponents, which was 3rd worst in the league, compared to a league average conversion of 8.7%. Set-piece conversion rates are notoriously unsustainable year-on-year though, so some regression towards more normal conversion rates could potentially bring down Liverpool’s goal per game average compared to last season.

While Liverpool’s headline numbers were reasonable, their tendency to shoot themselves in the foot and concede some daft goals was impressive in its ineptitude at times. Culprits typically included combinations of Rodgers’ tactics, Dejan Lovren’s ‘whack a mole’ approach to defending and the embers of Steven Gerrard’s Liverpool career. The defensive structure of the team should be improved now that Gerrard no longer needs to be accommodated at the heart of midfield, while Glen Johnson’s prolonged audition for an extra role in the Walking Dead will continue at Stoke. Nathaniel Clyne should be a significant upgrade at full back, with youngsters Ilori and Gomez presently with the squad and aiming to compete for a first team role.

Broadly speaking though, Liverpool’s defensive numbers were reasonable but with room for improvement. Their numbers looked ok for a Champions League hopeful rather than a title challenger. A more mobile midfield should enhance the protection afforded to the central defence, however it should line up. Whether the individual errors were a bug and not a feature of this Liverpool team will likely determine how the narrative around the defence continues this year.

Under-powered attack

Liverpool’s decline in underlying performance in 2014/15 was driven by a significant drop-off in their attacking numbers. The loss of Suárez was compounded by Daniel Sturridge playing just 750 minutes in the league all season; Sturridge isn’t at the same level as Suárez (few are) but he does represent a truly elite forward and the alternatives at the club weren’t able to replace him.

The loss of Suárez and Sturridge meant that Coutinho and Sterling were now the principal conduits for Liverpool’s attack. Both performed admirably and were among the most dangerous attackers in the division. The figure below details Liverpool’s players according to the number of dangerous passes per 90 minutes played, which is related to my pass-danger rating score. In terms of volume, Coutinho and Sterling were way ahead of their teammates and both ranked in the top 15 in the league (minimum of 900 minutes played). James Milner actually ranked seventh by this metric, so he could well provide an additional source of creativity and link well with Liverpool’s forward players.

Dangerous passes per 90 minutes played metric for Liverpool players in 2014/15. Right hand side shows total number of completed passes per 90 minutes.

As good as Coutinho and Sterling were from a creative perspective, they did lag behind the truly elite players in the league by these metrics. As with many of Liverpool’s better players, you’re often left with the caveat of stating how good they are for their age. That’s not a criticism of the players themselves, merely a recognition of their overall standing relative to their peers.

What didn’t help was the lack of attacking contribution from Liverpool’s peak-age attacking players; Lallana’s contribution was decidedly average, Sturridge is obviously capable of making a stellar contribution but injuries curtailed him, while Balotelli certainly provided a high shot volume powered by a predilection for shooting from range but a potential dose of bad luck meant his goal-scoring record was well below expectation.

While there were clearly good elements to Liverpool’s attack, they were often left shooting from long range. According to numbers published by Michael Caley, Liverpool took more shots from outside the box than any other team last year and had the fourth highest proportion of shots from outside the box (48%). Unsurprisingly, they had the third lowest proportion of shots from the central region inside the penalty area (34%), which is the so-called ‘danger zone’ where shots are converted at much greater rates than wide in the box and outside the area. With their shot volumes being pretty good last season (third highest total shots and fourth highest shots on target), shifting the needle towards better quality chances would certainly improve Liverpool’s prospects. The question is where will that quality come from?

Bobby & Ben

With Sturridge not due back until the autumn coupled with his prior injury record, Liverpool moved to sign Christian Benteke as a frontline striker with youngsters Ings and Origi brought in to fill out the forward ranks. Roberto Firmino was added before Sterling’s departure but the expectation is that he will line-up in a similar role as the dynamic attacking midfielder/forward.

Firmino brings some impressive statistical pedigree with him: elite dribbler, dangerous passer, a tidy shot profile for a non-striker and stand-out tackling numbers for his position. If he can replicate his Bundesliga form then he should be a more than adequate replacement for Sterling, while also having the scope to develop over coming seasons.

Benteke brings a good but not great goal-scoring record, with his record in open-play being particularly average. Although there have been question marks regarding his stylistic fit within the team, Liverpool have seemingly been pursuing a physical forward to presumably act as a ‘reference point’ in their tactical system over the past few years; Diego Costa was a target in 2013, while Wilfried Bony was linked in 2014. Benteke brings that to the table alongside a more diverse range of skills than he is given credit for, having been seemingly cast as an immobile lump of a centre forward by some.

Whether he has the necessary quality to improve this Liverpool team is the more pertinent question. From open-play, Benteke averages 2.2 shots per 90 minutes and 0.34 goals per 90 minutes over the past three seasons, which is essentially the average rate for a forward in the top European leagues. For comparison, Daniel Sturridge averages 4.0 shots per 90 minutes and 0.65 goals per 90 minutes over the same period. Granted, Sturridge has played for far greater attacking units than Aston Villa over that period but based on some analysis of strikers moving clubs that I’ve done, there is little evidence that shot and goal rates rise when moving to a higher quality team. Benteke does provide a major threat from set-pieces, which has been a productive source of goals for him but I would prefer to view these as an added extra on top of genuine quality in open-play, rather than a fig leaf.
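One way to unpack those per 90 figures is to divide goals by shots, which gives an implied open-play conversion rate. The short sketch below does that arithmetic using only the numbers quoted above; the function name is hypothetical and the result is a back-of-the-envelope figure rather than anything from a model.

```python
def goals_per_shot(goals_per_90: float, shots_per_90: float) -> float:
    """Implied conversion rate from two per 90 rates."""
    return goals_per_90 / shots_per_90


# Open-play rates quoted in the text (past three seasons).
benteke = goals_per_shot(0.34, 2.2)    # roughly 0.15 goals per shot
sturridge = goals_per_shot(0.65, 4.0)  # roughly 0.16 goals per shot
```

On these numbers the two conversion rates are similar, so the gap in goals per 90 comes mostly from shot volume.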

Benteke will need to increase his contribution significantly if he is to cover for Sturridge over the coming season, otherwise Liverpool may find themselves in the good but not great attacking category again.

Conclusion

So where does all of the above leave Liverpool going into the season? Most of the underlying numbers for last season suggested that Chelsea, Manchester City and Arsenal were well ahead of the pack and I don’t see much prospect of one of them dropping out of the top four. Manchester United, Liverpool and Southampton made up the trailing group, with these three plus perhaps Tottenham in a battle to be the ‘best of the rest’ or ‘least crap’ and claim the coveted fourth place trophy.

When framed this way, Liverpool’s prospects look more viable, although fourth place looks like the ceiling at present unless the club procure some adamantium to alleviate Sturridge’s injury woes. While Liverpool currently operate outside the financial Goldilocks zone usually associated with a title challenge, they should have the quality to mount a concerted challenge for that Champions League spot in what could be a tight race. They did put together some impressive numbers during the 3-4-3 phase of last season that was in-line with those expected of a Champions League contender; replicating and sustaining that level of quality should be the aim for the team this coming season.

Prediction: 4-6th, most likely 5th.

P.S. Can Liverpool be more fun this year? If you can’t be great, at least be fun.

Territorial advantage?

One of the recurring themes regarding the playing style of football teams is the idea that teams attempt to strike a balance between controlling space and controlling possession. The following quote is from this Jonathan Wilson article during the European Championships in 2012, where he discusses the spectrum between proactive and reactive approaches:

Great teams all have the same characteristic of wanting to control the pitch and the ball – Arrigo Sacchi.

No doubt there are multiple ways of defining both sides of this idea.

Controlling the ball is usually represented by possession, that is the proportion of the passes that a team plays in a single match or series of matches. If a team has the ball, then by definition, they are controlling it.

One way of defining the control of space is to think about ball possession in relation to the location of the ball on the pitch. A team that routinely possesses the ball closer to their opponents’ goal potentially benefits from the increased attacking opportunities that this provides, while also benefiting from the ball being far away from their own goal should they lose it.

There are certainly issues with defining control of space in this way though e.g. a well-drilled defence may be happy to see a team playing the ball high up the pitch in front of them, especially if they are adept at counter-attacking when they win the ball back.

Below is a heat map of the location of received passes in the 2013/14 English Premier League. The play is from left-to-right i.e. the team in possession is attacking towards the right-hand goal. We can see that passes are most frequently received in midfield areas, with the number of passes received decreasing quickly as we head towards each penalty area.

Heat map of the location of received passes in the 2013/14 English Premier League. Data via Opta.

Below is another heat map showing pass completion percentage based on the end point of the pass. The completion percentage is calculated by adding up all of the passes to a particular area on the pitch and comparing that to the number of passes that are successfully received. One thing to note here is that the end point of uncompleted passes relates to where possession was lost, as the data doesn’t know the exact target of each pass (mind-reading isn’t part of the data collection process as far as I know). That does mean that the pass completion percentage is an approximation but this is based on over 300,000 passes, so the effect is likely small.
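
The calculation described above can be sketched roughly as follows. This is a minimal illustration rather than the code behind the heat map: the pass records, the 100 x 100 pitch coordinates and the 10 x 10 grid are all assumptions made for the example, and `completion_by_zone` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical pass records: (end_x, end_y, completed).
# Coordinates assume a 100 x 100 pitch with play from left to right.
# For uncompleted passes, the end point is where possession was lost.
passes = [
    (12.0, 50.0, True),
    (14.0, 55.0, True),
    (55.0, 30.0, True),
    (58.0, 35.0, False),
    (92.0, 50.0, False),
    (94.0, 48.0, True),
    (93.0, 52.0, False),
]


def completion_by_zone(passes, n_x=10, n_y=10, size=100.0):
    """Return {(zone_x, zone_y): completion rate} based on pass end points."""
    attempted = defaultdict(int)
    completed = defaultdict(int)
    for x, y, ok in passes:
        # Map the end point to a grid cell, clamping points on the far edge.
        ix = min(int(x / size * n_x), n_x - 1)
        iy = min(int(y / size * n_y), n_y - 1)
        attempted[(ix, iy)] += 1
        if ok:
            completed[(ix, iy)] += 1
    return {zone: completed[zone] / attempted[zone] for zone in attempted}


rates = completion_by_zone(passes)
for zone, rate in sorted(rates.items()):
    print(zone, f"{rate:.0%}")
```

With real data, each grid cell would hold thousands of passes rather than one or two, which is what makes the approximation around the end point of failed passes tolerable.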

What is very clear from the below graphic is that when within a team’s own half, passes are completed routinely. The only areas where this drops are near the corner flags; I assume this is due to players either clearing the ball or playing it against an opponent when boxed into the corner.

Heat map of pass completion percentage based on the target of all passes in the 2013/14 English Premier League. Data via Opta.

As teams move further into the attacking half, pass completion drops. In the central zone within the penalty area, less than half of all passes are completed and this drops to less than 20% within the six yard box. These passes within the “danger zone” are infrequent and completed far less frequently than other passes. This danger zone is frequently cited by analysts looking at shot location data as the prime zone for scoring opportunities; you would imagine that receiving passes in this zone would be beneficial.

None of the above is new. In fact, Gabe Desjardins wrote about these features using data from a previous Premier League season here and showed broadly similar results (thanks to James Grayson for highlighting his work at various points). The main thing that looks different is the number of passes played into the danger zone; I’m not sure why this is, but 2012/13 and 2014/15 so far look very similar to the above in my data.

Gabe used these results to calculate a territory statistic by weighting each pass by its likelihood of being completed. He found that this measure was strongly related to success and the performance of a team.

Below is my version of territory plotted against possession for the 2013/14 Premier League season. Broadly there are four regimes in the below plot:

  1. Teams like Manchester City, Chelsea and Arsenal who dominate territory and have plenty of possession. These teams tend to pin teams in close to their goal.
  2. Teams like Everton, Liverpool and Southampton who have plenty of possession but don’t dominate territory (all three are just under a 50% share). Swansea are an extreme case, as they have lots of possession but it is concentrated in their own half where passes are easier to complete.
  3. Teams like West Brom and Aston Villa who have limited possession but move the ball into attacking areas when they do have it. These are quite direct teams, who don’t waste much time in their build-up play. Crystal Palace are an extreme in terms of this approach.
  4. Teams that have limited possession and when they do have it, they don’t have much of it in dangerous areas at the attacking end of the pitch. These teams are going nowhere, slowly.

Territory percentage plotted against possession for English Premier League. Data via Opta.

Liverpool are an interesting example, as while their overall territory percentage ranks at fourteenth in the league, this didn’t prevent them moving the ball into the danger zone. For just passes received within the danger zone, they ranked third on 3.4 passes per game behind Chelsea (3.8) and Manchester City (4) and ahead of Arsenal on 2.9.

This ties in with Liverpool’s approach last season, where they would often either attack quickly when winning the ball or hold possession within their own half to try and draw teams out and open up space. Luis Suárez was crucial in this aspect, as he averaged 1.22 completed passes into the danger zone per 90 minutes. This was well ahead of Sergio Agüero in second place on 0.94 per 90 minutes.

The above is just a taster of what can be learnt from this type of data. I’ll be expanding on the above in more detail and for more leagues in the future.

Win, lose or draw

The dynamics of a football match are often dictated by the scoreline and teams will often try to influence this via their approach; a fast start in search of an early goal, keeping it tight with an eye on counter-attacking or digging a moat around the penalty area.

With this in mind, I’m going to examine the repeatability of the amount of time a team spends winning, losing and drawing from year to year. I’m basically copying the approach of James Grayson here who has looked at the repeatability of several statistical metrics. This is meant to be a broad first look; there are lots of potential avenues for further study here.

I’ve collected data from football-lineups.com (tip of the hat to Andrew Beasley for alerting me to the data via his blog) for the past 15 English Premier League seasons and then compared each team’s performance from one season (year zero) to the next (year one). Promoted or relegated teams are excluded as they don’t spend two consecutive seasons in the top flight.

Losers

Below is a plot showing how the time spent losing varies in consecutive seasons. Broadly speaking, there is a reasonable correlation from one season to the next but with a degree of variation also (R^2=0.41). The data suggests that 64% of time spent losing is repeatable, leaving 36% in terms of variation from one season to the next. This variation could result from many factors such as pure randomness/luck, systemic or tactical influences, injury, managerial and/or player changes etc.

Relationship between time spent losing per game from one season to the next.

As might be expected, title winning teams and relegated sides tend towards the extreme ends in terms of time spent losing. Generally, teams at these extreme ends in terms of success over and under perform respectively compared to the previous season.

Winners

Below is the equivalent plot for time spent winning. Again there is a reasonable correlation from one season to the next, with the relationship for time spent winning (R^2=0.47) being stronger than for time spent losing. The data suggests that 67% of time spent winning is repeatable, leaving 33% in terms of variation from one season to the next.

Relationship between time spent winning per game from one season to the next.

As might be expected, title winning teams spend a lot of time winning. The opposite is true for relegated teams. Title winners generally improve their time spent winning compared to the previous season. Interestingly, they often then see a drop off in the following season.

Manchester City and Liverpool really stick out here in terms of their improvement relative to 2012/13. Liverpool spent 19 minutes more per game in a winning position in 2013/14 than they did the previous season; I have this as the second biggest improvement in the past 15 seasons. They were narrowly pipped into second place (sounds familiar) by Manchester City this season, who improved by close to 22 minutes. They spent 51 and 48 minutes in a winning position per game respectively. They occupy the top two slots for time spent winning in the past 15 seasons.

According to football-lineups.com, Manchester City and Liverpool scored their first goals of the match in the 26th and 27th minutes respectively. Chelsea were the next closest in the 38th minute. They were also in the top four for how late they conceded their first goal on average, with Liverpool conceding in the 55th minute and City in the 57th. Add in their ability to rack up the goals when leading and you have a recipe for spending a lot of time winning.

Illustrators

The final plot below is for time spent drawing. Football-lineups doesn’t report the figures for drawing directly so I just estimated it by subtracting the winning and losing figures from 90. There will be some error here as this doesn’t account for injury time but I doubt it would hugely alter the general picture. The relationship here from season to season is almost non-existent (R^2=0.013), which implies that time spent drawing regresses to the mean by 89% from season to season.

Relationship between time spent drawing per game from one season to the next.

Teams seemingly have limited control on the amount of time they spend drawing. I suspect this is a combination of team quality and incentives. Good teams have a reasonable control on the amount of time they spend winning and losing (as seen above) and it is in their interests to push for a win. Bad teams will face a (literally) losing battle against better teams in general, leading to them spending a lot of time losing (and not winning). It should be noted that teams do spend a large proportion of their time drawing though (obviously this is the default setting for a football match given the scoreline starts at 0-0), so it is an important period.
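
For anyone wanting to reproduce the repeatability figures, the percentages quoted for losing and drawing are consistent with taking the square root of R^2 to get the correlation coefficient r, and treating 1 - r as the regression to the mean. The sketch below assumes that reading and a positive correlation; `repeatability` is just an illustrative helper name.

```python
import math


def repeatability(r_squared):
    """Return (r, regression to the mean) for a season-to-season R^2.

    Assumes the underlying correlation is positive.
    """
    r = math.sqrt(r_squared)
    return r, 1.0 - r


# R^2 values as quoted above for time spent losing and drawing.
for label, r2 in [("losing", 0.41), ("drawing", 0.013)]:
    r, regress = repeatability(r2)
    print(f"{label}: r = {r:.2f}, regression to the mean = {regress:.0%}")
```

This gives r = 0.64 for losing (so 36% regression) and roughly 89% regression for drawing, matching the figures in the text.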

We can also see the shift in Liverpool and Manchester City’s numbers; they replaced fairly average numbers for time spent drawing in 2012/13 with much lower numbers in 2013/14. Liverpool’s time spent drawing figure of 29.8 minutes this season was the lowest value in the past 15 seasons according to this data!

Baked

There we have it then. In broad terms, time spent winning and losing exhibit a reasonable degree of repeatability but with significant variation superimposed. In particular, it seems that title winners require a boost in their time spent winning and a drop in their time spent losing to claim their prize. Perhaps unsurprisingly, things have to go right for you to win the title.

As far as this season goes, Manchester City and Liverpool both improved their time spent winning dramatically. If history is anything to go by, both will likely regress next season and not have the scoreboard so heavily stacked in their favour. It will be interesting to see how they adapt to such potential challenges next year.

Luis Suárez: Home & away

Everyone’s favourite riddle wrapped in an enigma was a topic of Twitter conversation between various analysts yesterday. The matter at hand was Luis Suárez’s improved goal conversion this season compared to his previous endeavours. Suárez has previously been labelled as inefficient by members of the analytics community (not the worst thing he has been called mind), so explaining his upturn is an important puzzle.

In the 2012/13 season, Suárez scored 23 goals from 187 shots, giving him a 12.3% conversion rate. So far this season he has scored 25 goals from 132 shots, which works out at 18.9%.

What has driven this increased conversion?

Red Alert

Below I’ve broken down Suárez’s goal conversion exploits into matches played home and away over the past two seasons. In terms of sample sizes, in 2012/13 he took 98 shots at home and 89 shots away, while he has taken 69 and 63 respectively this season.

Season     Home     Away     Overall
2012/13    11.2%    13.5%    12.3%
2013/14    23.2%    14.3%    18.9%

The obvious conclusion is that Suárez’s improved goal scoring rate has largely been driven by an increased conversion percentage at home. His improvement away is minor, coming in at 0.8 percentage points, but his home improvement is a huge 12 percentage points.
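
As a quick check of the arithmetic, here is a minimal sketch that recomputes the overall conversion rates from the goal and shot totals quoted earlier and the season-on-season changes from the home and away percentages in the table. The variable and function names are mine and purely illustrative.

```python
def conversion(goals, shots):
    """Goal conversion as a percentage of shots taken."""
    return 100.0 * goals / shots


# Overall totals quoted in the post: goals, shots.
overall = {
    "2012/13": conversion(23, 187),
    "2013/14": conversion(25, 132),
}

# Home and away conversion rates (percent) as given in the table.
home = {"2012/13": 11.2, "2013/14": 23.2}
away = {"2012/13": 13.5, "2013/14": 14.3}

# Changes are differences between percentages, i.e. percentage points.
home_change = home["2013/14"] - home["2012/13"]
away_change = away["2013/14"] - away["2012/13"]

for season, rate in overall.items():
    print(f"{season}: {rate:.1f}% overall")
print(f"home change: {home_change:.1f} points, away change: {away_change:.1f} points")
```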

What could be driving this upturn?

Total Annihilation

Liverpool’s home goal scoring record this season has seen them average 3 goals per game compared to 1.7 last season. Liverpool have handed out several thrashings at home this season, scoring 3 or more goals in nine of their fourteen matches. Their away goal scoring has improved from 2 goals per game to 2.27 per game for comparison.

Liverpool have been annihilating their opponents at home this season and I suspect Suárez is reaping some of the benefit of this with his improved goal scoring rate. Liverpool have typically gone ahead early in their matches at home this season but aside from their initial Suárez-less matches, that hasn’t generally seen them ease off in terms of their attacking play (they lead the league in shots per game at home with 20.7).

My working theory is that Suárez has benefited from such situations by taking his shots under less pressure and/or better locations when Liverpool have been leading at home. I would love to hear from those who collect more detailed shot data on this.

Drilling down into some more shooting metrics at home adds some support to this. Suárez has seen a greater percentage of his shots hit the target at home this season compared with last (46.4% vs 35.7%). He has also seen a smaller percentage being blocked this season (13% vs 24.5%). Half of Suárez’s shots on target at home this season have resulted in a goal compared to 31.4% last season. Away from home, the comparison between this season and last is much closer.

These numbers are consistent with Suárez taking his shots at home this season in better circumstances. I should stress that there is a degree of circularity here as Suárez’s goal scoring is not independent of Liverpool’s. Further analysis is required.

Starcraft

The above is an attempt to explain Suárez’s improved goal scoring form. I doubt it is the whole story but it hopefully provides some clues ahead of more detailed analysis. Suárez may well have also benefited from a hot-streak this season and the big question will be whether he can sustain his goal scoring form over the remainder of this season and into next.

As I’ve shown previously, there is a large amount of variability in player shot conversion from season to season. Some of this will be due to ‘luck’ or randomness but some of this could be due to specific circumstances such as those potentially aiding Suárez this season. Explaining the various factors involved in goal scoring is a tricky puzzle indeed.

——————————————————————————————————————–

All data in this post are from Squawka and WhoScored.

Newcastle United vs Liverpool: passing network analysis

Liverpool defeated Newcastle 6-0 at St James’ Park. Below is the passing network analysis for Liverpool split between the first 75 minutes of the match and the rest of the match up to full time. I focussed just on Liverpool here. More information on how these are put together is available here in my previous posts on this subject.

The reason I separated the networks into these two periods was that I noticed how Liverpool’s passing rate changed massively after Steven Gerrard was substituted and the fifth goal was scored. During the first 75 minutes, Liverpool attempted 323 passes with a success rate of 74% and a 45% share of possession. After this, Liverpool attempted 163 passes with an accuracy of 96% and a 60% share of possession. Liverpool attempted 34% of their passes in this closing period. Let’s see how this looks in terms of their passing network.

The positions of the players are loosely based on the formations played, although some creative license is employed for clarity. It is important to note that these are fixed positions, which will not always be representative of where a player passed/received the ball. The starting eleven is shown on the pitch for the first 75 minutes, with Borini replacing Gerrard in the second network.

Passing networks for Liverpool for the first 75 minutes and up to full time against Newcastle United from the match at St James’ Park on the 27th April 2013. Only completed passes are shown. Darker and thicker arrows indicate more passes between each player. The player markers are sized according to their passing influence, the larger the marker, the greater their influence. Click on the image for a larger view.

Liverpool’s passing was quite balanced for the first 75 minutes of the match, with a varied passing distribution. There was a stronger bias towards the right flank compared with the left flank as Gerrard drifted right to combine with Johnson and Downing. The passing influence scores were also evenly distributed across the whole team with Gerrard and Lucas being the top two. A contrast with some previous matches is the lack of strong links along the back line, which indicates less reliance on recycling of possession in deeper areas. Instead, Liverpool were seeking to move the ball forward more quickly and played the ball through the whole team.

He makes us happy

After Gerrard and Lucas, the next most influential player was Coutinho, who put in a wonderfully creative performance as the attacking fulcrum of the team. He linked well with all of Liverpool’s forward players and threaded several dangerous passes to his team-mates including an assist and a ‘second goal assist’ (defined as a pass to the goal assist creator) for the second goal according to EPL-Index. His creative exploits thus far have been hugely promising during his first 10 appearances.

Sterile domination

The final period of the match saw Liverpool really rack up the passing numbers as mentioned earlier. Clearly, this is easier to do when 5 or 6 goals clear but it is still potentially illustrative to see how this was accomplished. The main orchestrators of this were Lucas and Henderson who were 28/28 and 35/35 for passes attempted/completed during this period. Henderson was 21/24 from the first 75 minutes, so this was quite a rapid increase with his shift in role after Gerrard went off and the state of the game.

Your challenge should you wish to accept it

Admittedly Newcastle were very poor in this match but Liverpool took advantage to enact a severe thrashing. This was accomplished without Suárez, which leads to obvious (premature?) questions about whether his absence improved Liverpool’s overall balance and play. Assuming that Suárez doesn’t leave in the summer, one of Brendan Rodgers’ key tasks will be developing a system that gets the best out of the attacking talents of Suárez, Coutinho and Sturridge. It could be quite tasty if he manages to accomplish this.

Liverpool vs Zenit St Petersburg: passing network analysis

Liverpool beat Zenit 3-1 at Anfield but went out of the Europa League on away goals. Below is the passing network analysis for Liverpool for both the first hour and the final 30 minutes of the match. This coincides with Liverpool’s sumptuous third goal and the double substitution that saw Assaidi and Shelvey replace Henderson and Allen. More information on how these passing networks are put together is available here in my previous posts on this subject.

The positions of the players are loosely based on the formations played by the two teams, although some creative license is employed for clarity. It is important to note that these are fixed positions, which will not always be representative of where a player passed/received the ball. The starting eleven is shown on the pitch for the first hour, with the substitutes shown for the final 30 minutes. Sterling was only on the pitch for a brief period so I’ve omitted him from the second network.

Passing networks for Liverpool for the first 60 minutes and final 30 minutes of the match against Zenit St Petersburg from the match at Anfield on the 21st February 2013. Only completed passes are shown. Darker and thicker arrows indicate more passes between each player. The player markers are sized according to their passing influence, the larger the marker, the greater their influence. Players with an * next to their name were substituted. Click on the image for a larger view.

Liverpool’s initial selection circulated possession well within the midfield zone, which is perhaps unsurprising given how possession friendly the midfield was. Compared with Coutinho and Suárez in the match against Swansea, Henderson and Allen primarily look to maintain possession rather than being more direct with their approach play. This meant that Liverpool dominated possession and kept Zenit pinned back in their half generally. Enrique and Johnson were also heavily involved and provided a great deal of width. At the hub of Liverpool’s play was Lucas who knitted things together superbly and combined effectively with all of his team mates.

Zenit did generally defend very well though and Liverpool struggled to create particularly incisive moves, although Allen’s goal was the result of excellent interplay between Henderson and Enrique (the strongest passing link in the first hour). Two set-piece goals from Suárez though set the platform for a potentially memorable comeback after Zenit’s away goal.

Anything could happen in the next half hour

Liverpool’s double substitution after the third goal saw two more direct attacking threats joining the fray as the side looked for a potential tie-winning goal. However, looking at the passing network for the last half hour, Liverpool struggled to bring their attacking players into the game. Liverpool’s shot frequency actually declined in this period with a succession of crosses from both open-play and set-pieces being delivered into the box. Zenit defended particularly well during this period and maintained possession for short periods to stem the tide of Liverpool attacks. They also pressed high up the pitch which saw some nervous moments in the crowd as well as the odd passage on the pitch! While the changes likely didn’t help Liverpool to any great extent, chances were still created that could have won the tie plus Zenit also boxed clever while often under a lot of pressure.

Over and out

Unfortunately Liverpool weren’t able to score that crucial fourth goal in the final 30 minutes that could have seen them go through. On a personal note, it was a privilege to be a part of a fantastic atmosphere at Anfield, which nearly saw an improbable comeback to add to Liverpool Football Club’s folklore.

Liverpool vs West Bromwich Albion: passing network analysis

Liverpool lost to West Bromwich Albion 2-0 at Anfield. Below is the passing network analysis for Liverpool and West Brom. More information on how these are put together is available here in my previous posts on this subject.

The positions of the players are loosely based on the formations played by the two teams, although some creative license is employed for clarity. It is important to note that these are fixed positions, which will not always be representative of where a player passed/received the ball. Only the starting eleven is shown on the pitch, as the substitutes weren’t hugely interesting from a passing perspective in this instance.

Passing network for Liverpool and West Brom from the match at Anfield on the 11th February 2013. Only completed passes are shown. Darker and thicker arrows indicate more passes between each player. The player markers are sized according to their passing influence, the larger the marker, the greater their influence. The size and colour of the markers is relative to the players on their own team i.e. they are on different scales for each team. The player markers are coloured by the number of times they lost possession during the match, with darker colours indicating more losses. Only the starting eleven is shown. Players with an * next to their name were substituted. Click on the image for a larger view.

There are some contrasting features between the two sides here. Liverpool’s standard recycling of possession in deeper areas is evident, with interplay between Reina, the back four and the midfield two of Lucas and Gerrard. West Brom showed some similar features, although the link between their centre backs is much weaker than the link between Agger and Carragher.

Mulumbu and Morrison were impressive for West Brom, linking well with the players around them. They formed some nice triangular passing structures with those around them, particularly with their midfield partner Yacob. Based on their passing network, West Brom passed the ball around well when they had it although Long wasn’t hugely involved (he did provide his usual nuisance value though).

One of the major differences is how both sides involved their respective centre forwards. Long generally either received the ball from deeper areas e.g. the long link between himself and Foster (although many of the passes were unsuccessful) or by linking up with Morrison, who was typically the most advanced of West Brom’s central midfielders. In contrast, the link between Shelvey and Suárez is almost non-existent. Given that these two were ostensibly Liverpool’s two most attacking players, the lack of interplay between them was disappointing.

Ineffectual width

With Henderson and Downing continuing on their “unnatural” sides, Liverpool’s fullbacks had plenty of space to move into down the flanks. This meant they were often a natural passing outlet for their team mates and this is highlighted by the high passing influence scores they both received. Unfortunately, much of the attacking impetus that Enrique and Johnson provided was highly wasteful. As noted on the Oh you beauty blog, their pass completion in the final third was woeful. Between them, Enrique and Johnson accounted for 30% of Liverpool’s total losses of possession. Enrique misplaced 9 passes within his own half also, as noted by WhoScored. Generally I’ve interpreted a higher passing influence score as being a good thing but perhaps in this instance this wasn’t the case.

That is why we like him

Aside from Enrique and Johnson, the main passing influence for Liverpool was Lucas. Lucas’ absolute and relative passing influence within the team has been steadily increasing over recent matches, which is encouraging as he recovers from his injury issues. Unfortunately for Liverpool, Gerrard, Henderson and Downing had less influence than in recent weeks, which alongside the lack of partnership between Shelvey and Suárez, went some way to Liverpool struggling to open up West Brom.

Manchester City vs Liverpool: passing network analysis

Manchester City drew 2-2 with Liverpool at the Etihad. Below is the passing network analysis for Manchester City and Liverpool. More information on how these are put together is available here in my previous posts on this subject.

The positions of the players are loosely based on the formations played by the two teams, although some creative license is employed for clarity. It is important to note that these are fixed positions, which will not always be representative of where a player passed/received the ball. Only the starting eleven is shown on the pitch, as the substitutes weren’t hugely interesting from a passing perspective in this instance.

Passing network for Manchester City and Liverpool from the match at the Etihad on the 3rd February 2013. Only completed passes are shown. Darker and thicker arrows indicate more passes between each player. The player markers are sized according to their passing influence, the larger the marker, the greater their influence. The size and colour of the markers is relative to the players on their own team i.e. they are on different scales for each team. Only the starting eleven is shown. Players with an * next to their name were substituted. Click on the image for a larger view.

In the reverse fixture, Yaya Touré and De Jong were very influential for City but Touré was away at the African Cup of Nations, while De Jong joined Milan shortly after that fixture. Their replacements in this game, Barry and Garcia, were less influential, although Barry had the strongest passing influence for City in this match, with Milner second. The central midfield two, Lucas and Gerrard, were very influential for Liverpool and strongly dictated the passing patterns of the team. They both linked well with the fullbacks and wider players, while Lucas also had strong links with Suárez and Sturridge. Certainly in this area of the pitch, Liverpool had the upper hand over City and this provided a solid base for Liverpool in the match.

No Silva lining

Something that Liverpool did particularly well was limit the involvement of David Silva, who posted his worst pass completion rate (73% via EPL-Index) this season. Usually, Silva completes a pass every 96 seconds this season, whereas against Liverpool it was every 162 seconds. While Mancini’s tactical change did bring Silva more into the game briefly, overall it had a negligible impact upon Silva’s influence when comparing the networks before and after the substitution. However, one of the few occasions where Silva was able to find some time and space, he combined well with James Milner to help create City’s first goal. Goes to show it is difficult to keep good players quiet for a whole match.

Moving forward

Similarly to the Arsenal game, Liverpool showed less of an emphasis upon recycling the ball in deeper areas. Instead, they favoured moving the ball forward more directly, with Enrique often being an outlet for this via Reina and Agger. Liverpool’s fullbacks combined well with their respective wide-players, while also being strong options for Lucas and Gerrard. Sturridge was generally excellent in this match and was more influential in terms of passing than in his previous games against Norwich and Arsenal, combining well with Suárez, Lucas and Gerrard.

At least based on the past few games, Liverpool have shown the ability to alter their passing approach with a heavily possession orientated game against Norwich, followed up by more direct counter-attacking performances against Arsenal and Manchester City. The game against City was particularly impressive as this was mixed in with some good control in midfield via Lucas and Gerrard, which was absent against Arsenal. How this progresses during Liverpool’s next run of fixtures will be something to look out for.