I’ve been closely following the developments in the football analytics community for close to two years now, ever since WhoScored allied themselves with the Daily Mail and suggested Xavi wasn’t so good at the football and I was directed to James Grayson’s wonderful riposte.
There has been some discussion on Twitter about the state of football analytics recently and I thought I would commit some extended thoughts on this topic to writing.
Has football analytics stalled?
Part of the Soccermetrics podcast, featuring Howard Hamilton and Zach Slaton, revolved around how football analytics as an activity has “stalled” (Howard has since attributed the “stalled” statement to Chris Anderson, although he seemingly agrees with it). Even though this wasn’t really defined, I find it difficult to comprehend the view that analytics has stalled.
Over the past two years, the community has developed a lot as far as I can see. James Grayson and Mark Taylor continue to regularly publish smart work, while new bloggers have emerged also. The StatsBomb website has brought together a great collection of analysts and thinkers on the game and they appear to be gaining traction outside of the analytics echo chamber.
In addition to this, data is increasingly finding a place in the mainstream media; Zach Slaton writes at Forbes, Sean Ingle is regularly putting numbers into his Guardian columns and there is a collection of writers contributing to the Dutch newspaper De Volkskrant. Mike Goodman is doing some fantastic work at Grantland; his piece on Manchester United this season is an all too rare example of genuine insight in the wider football media. The Numbers Game book by Chris Anderson and David Sally was also very well received.
Allied to these writing developments, a number of analytics bloggers have joined professional clubs or data organisations recently – surely it is encouraging to see smart people being brought into these environments? (One side effect of this is that some great work is lost from the public sphere though e.g. the StatDNA blog).
To me, this all seems like progress on a number of fronts.
What are we trying to achieve?
The thing that isn’t clear to me is what people in the analytics community are actually aiming for. Some are showcasing their work with the aim of getting a job in the football industry, some are hoping to make some money, while others are doing it as a hobby (*waves hand*). Whatever the motivation, the work coming out of the community is providing insights, context and discussion points and there is an audience for it even if it is considered quite niche.
Football analytics is still in its infancy and expecting widespread acceptance in the wider football community at this stage is perhaps overly ambitious. However, strides are being made; tv coverage has started looking more at shot counts over a season and heat maps of touches have made a few appearances. These are small steps undoubtedly but I doubt there is much demand for scatter plots, linear regression and statistical significance tests from tv producers. Simple and accessible tables or metrics that can be overlaid on an image of a football pitch seem to go down well with a broader audience – the great work being done on shot locations seems ripe for this as it is accessible and intuitive without resorting to complex statistical language.
However, I don’t think the media should be the be all and end all for judging the success or progress of football analytics. Fan discussion of football is increasingly found online in the form of blogs, forums and Twitter, so the media don’t have to be the gatekeepers to analytics content. Saying that, I would love to see more intelligent discussion of football in the media and I feel that analytics is well placed to contribute to that. I’d be interested to hear what it is people in the football analytics community are aiming for in the longer term.
What about the clubs?
The obvious aspect of the analytics community that I’ve omitted from the discussion so far is the role of the clubs in all this. It’s difficult to know what goes on within clubs due to their secrecy. The general impression I get is that there are analytics teams toiling away but without necessarily making an impact in decision making at the club, whether that is in terms of team analysis or in the transfer market. Manchester City are one example of a team using data for such things based on this article.
With this in mind, I was interested to listen to the Sky Sports panel discussion show featuring Chris Anderson, Damien Commoli and Sam Allardyce. Chris co-authored the excellent The Numbers Game book and brought some nuance and genuine insight to the discussion. Commoli is mates with Billy Beane. Allardyce is held up as an acolyte for football analytics at the managerial level in English football and I think this is first time I’ve really heard him speak about it. I wasn’t impressed.
Allardyce clearly takes an interest in the numbers side of the game and reeled off plenty of figures, which on the surface seemed impressive. He seemingly revels in the idea that he is some sort of visionary with his interest in analytics, repeating on several occasions how he has been using data for over ten years. He seemed particularly pleased with the “discovery” of how many clean sheets, goals and other aspects of the game were required to gain a certain number of points in the Premier League; something that many analysts could work out in their lunch hour given the appropriate data.
I would question how this analysis and many of the other nuggets he threw out are actually actionable though; much of this is just stamp-collecting and doesn’t really move things forward in terms of actually identifying what is happening at the process level on the pitch. For example, Commoli’s statistic on a team never losing when having ten or more shots on goal, which is valuable information for those footballers who don’t aim for the goal. Now it could be that they were holding back the good stuff but several of their comments suggested they don’t really understand core analytics concepts such as regression to the mean and the importance of sample size e.g. referring to Aaron Ramsey’s unsustainable early-season scoring run. I would have expected more from people purporting to be leading take up of analytics at club level.
I felt Allardyce’s comment about his “experience” being better than “maths” when discussing the relationship between money and success betrayed his actual regard for the numbers side of football. Many of the numbers he quoted seemed to be used to confirm his own ideas about the game. This is fine but I think to genuinely gain an edge using analytics, you need to see where the data takes you and make it actionable. This is hard and is something that the analytics community could do better (Paul Riley is doing great work on his blog in this area for goalkeepers). The context that analytics provides is very valuable but without identifying “why” certain patterns are observed, it is difficult to alter the process on the field.
Based on the points that Allardyce made, I have my doubts whether the clubs are any further ahead in this regard than what is done publicly by the online analytics community. If there is a place where analytics has stagnated, maybe it is the within the clubs. To my mind, they would do well to look at what is going on in the wider community and try to tap into that more.
Shorter version of this post, courtesy of Edward Monkton.
Where are we Going?
I don’t know, I thought you knew.
No I don’t know. Maybe he knows.
No, He definitely doesn’t know.
Maybe no-one knows.
Oh Well. I hope it’s nice when we get there.