MY name’s Andrew, and I’m a statto, writes Andrew Beasley.
It’s not always easy admitting that to football fans. The mainstream stat sites routinely churn out tweets featuring numbers without context that tell you little or nothing. The CIES Football Observatory, whoever they are, use stats to tell you that Dejan Lovren is the best centre-back in the Premier League; Saturday’s dreadful defending against Swansea, which is hardly an isolated case, firmly suggests otherwise.
Damien Comolli built Liverpool’s attack by looking at stats, and then the Reds posted their sixth worst goals-per-league-game average in their history.
Yep, a lot of football fans hold a very dim view of stats, and thanks to stuff like this, which gives them a bad name, it’s hardly a surprise.
And despite my confession at the top of the page, I’m not here to particularly try to convince you otherwise. If you’re not interested in how many tackles per game Lucas Leiva averages, or how many chances Adam Lallana creates when he plays in midfield compared to in the front three, then nothing I can say here is likely to change your mind.
What I do want to talk about, though, is the notion that they can’t tell us anything at all about football. I think there are misconceptions out there about which stats matter and what they can tell us, and there’s definitely confusion about what certain stats actually mean in the first place.
The inspiration for this article came from a recent TAW Player Ban This Filth show, when one of the categories was regarding a stat you’d like to consign to history.
As there was no statto on the panel, I thought it might be worth me throwing my two pence in here.
The first stat that was up for debate was passing accuracy. To my mind, this doesn’t tell us that much about a player’s performance in a match. Exhibit A:
Joel Matip's game by numbers vs. Chelsea:
100% aerial duels won
95% pass accuracy
3 tackles won
— Squawka Football (@Squawka) September 16, 2016
Joel Matip completed 95 per cent of his passes at Chelsea; sounds impressive, no? A closer look reveals that 41 of his 43 passes were short though, with 15 of them being sideways or backwards.
In other words, I’d be annoyed if he hadn’t completed about 95 per cent of his passes. This is in no way a pop at Matip, but it shows how in isolation pass accuracy often doesn’t mean a great deal (and as an aside, he won 100 per cent of the one aerial duel he contested; remember how I said sites can post numbers that mean nothing?).
And yet on a team level I think pass accuracy can be quite a revealing stat, especially if the right context is applied.
Let me give you an example. Everton had a pass accuracy of 63 per cent against Liverpool at Goodison last month. That alone doesn’t tell us much, but if you know that this is 15 per cent below their average for the season, and they completed around 150 fewer passes than normal, then you learn a bit more. You can surmise that they struggled to get out from the back, that they couldn’t cope with Liverpool’s pressure, and they were hurried and harried into poor passing.
You might respond, “I could see that with my eyes, why would I need stats to tell me that?” and that’s an entirely valid response, though as Sean Rogers often says on The Tuesday Review, you can still judge with your eyes but then check the data to see if your reading of the match was correct or not.
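For anyone who wants to check the Everton numbers, here is a minimal sketch of the arithmetic. The only figures taken from the article are the 63 per cent match accuracy and the "15 per cent below" gap; the function names are my own, and because "15 per cent below" can be read two ways (percentage points or a relative drop), the sketch works out the implied season average under both readings rather than assuming one.

```python
def baseline_from_points(match_pct: float, gap_points: float) -> float:
    """Season average if the gap is measured in percentage points."""
    return match_pct + gap_points


def baseline_from_relative(match_pct: float, gap_relative: float) -> float:
    """Season average if the gap is a relative drop (0.15 means 15% lower)."""
    if not 0 <= gap_relative < 1:
        raise ValueError("gap_relative must be in [0, 1)")
    return match_pct / (1 - gap_relative)


match_accuracy = 63.0  # Everton's pass accuracy in the derby, from the article

# Reading 1: 15 percentage points below the season average.
points_baseline = baseline_from_points(match_accuracy, 15.0)

# Reading 2: 15 per cent lower than the season average in relative terms.
relative_baseline = baseline_from_relative(match_accuracy, 0.15)

print(f"Implied season average (points reading):   {points_baseline:.1f}%")
print(f"Implied season average (relative reading): {relative_baseline:.1f}%")
```

Either way the drop is large, which is the point: the single match figure only becomes informative once it sits next to the team's usual level.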
It’s a similar tale for the next stat up for binning: expected goals (or ‘xG’ for short).
If you’re not aware of what that is, each shot a team takes and concedes is assigned a value based on the historic likelihood of it being scored, the figures are totalled up, and you then get a value for their attacking and defensive efforts.
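To make that totalling-up concrete, here is a rough sketch in Python. The per-shot values are invented purely for illustration and do not come from any real model or match; the `Shot` class and `expected_goals` function are hypothetical names of my own. Real models derive each shot's value from large historical datasets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shot:
    team: str
    xg: float  # assumed probability (0 to 1) that a shot like this is scored


def expected_goals(shots: list[Shot]) -> dict[str, float]:
    """Total the per-shot probabilities for each team."""
    totals: dict[str, float] = {}
    for shot in shots:
        totals[shot.team] = totals.get(shot.team, 0.0) + shot.xg
    return totals


# Made-up values: team A takes many low-quality shots, team B a few good ones.
shots = (
    [Shot("A", 0.05)] * 8
    + [Shot("B", 0.40), Shot("B", 0.35), Shot("B", 0.30)]
)

totals = expected_goals(shots)
for team, xg in sorted(totals.items()):
    print(f"{team}: {xg:.2f} xG")
```

In this toy example team A has eight shots to team B's three, yet B ends up with the higher total. A team's defensive figure is simply the total for the shots it conceded, which here is the other side's number.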
Let’s say you didn’t watch Liverpool’s recent defeat to Swansea. Wishful thinking, eh? You might read that the Reds had 16 shots to The Swans’ six, with a count of 5-3 on target, and assume that Klopp’s men deserved to win a match they lost.
However, Exhibit B:
xG map for Liverpool – Swansea. Strong Swansea performance in a match with some great finishing. pic.twitter.com/cc8S2xaVKT
— Caley Graphics (@Caley_graphics) January 21, 2017
Swansea might have had a lot fewer shots than Liverpool, but they were of sufficient quality that according to expected goals they deserved to win. Again, “I saw this myself, poindexter” is fair comment, but how about the several hundred Premier League matches you haven’t watched this season — who was the better team in those?
You could watch Match Of The Day to get some idea, but as Liverpool Twitter usually says, they don’t show a fair representation of our match, so why would you assume they do for everyone else?
This for me is the main selling point of football stats. They should act as a shortcut to learning about teams and players you can’t hope to watch every minute of.
A lot of fans will watch a YouTube compilation of a player when they’re linked with a transfer, and as it shows you things that the public stats can’t (pace, positioning, ability on the ball in tight spaces etc) then it’s not without merit. But equally you’re only watching five minutes out of the two thousand plus they’ve probably played that season.
Exhibit C, Alberto Moreno:
The highlight reel shows his three league goals in 2013-14, and many fans assumed he might net more when he joined a higher scoring team. Yet he scored with three of his four shots on target that season, which is obviously not sustainable (even Messi scores with only around 40 per cent of his shots on target), and Moreno’s other 18 efforts were all blocked or off target; highlight reels understandably don’t tend to show these.
Three league goals in the three seasons since suggests that the stats gave a better analysis of his goalscoring prowess than a short video would, and I suspect that his expected goals figure was far lower than his actual output too.
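The arithmetic behind the Moreno example can be laid out in a few lines of Python. The shot and goal counts are the ones quoted above, and the 40 per cent benchmark is the rough Messi figure mentioned in passing; the variable and function names are my own, and this is only a back-of-the-envelope check, not an expected goals model.

```python
def rate(successes: int, attempts: int) -> float:
    """Return successes / attempts as a fraction."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    return successes / attempts


goals = 3
shots_on_target = 4
other_shots = 18  # blocked or off target
total_shots = shots_on_target + other_shots

on_target_conversion = rate(goals, shots_on_target)
overall_conversion = rate(goals, total_shots)

# Rough benchmark from the article: an elite finisher converts about 40%
# of his shots on target.
benchmark = 0.40
goals_at_benchmark = shots_on_target * benchmark

print(f"Conversion of shots on target: {on_target_conversion:.0%}")
print(f"Conversion of all shots:       {overall_conversion:.1%}")
print(f"Goals at the 40% benchmark:    {goals_at_benchmark:.1f}")
```

Even at an elite conversion rate, four shots on target would be expected to yield fewer than two goals, which is why the three he actually scored looked like a poor guide to what would follow.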
The final ‘filth’ up for banning on the TAW Player subscription show was tackle stats.
This is a subject close to my heart, as I penned an article back in 2013 about how the official definition of tackles won or lost doesn’t really make sense.
In short, the easiest way to ‘win’ a tackle is to put the ball out of play, which is why the likes of Antonio Valencia (as Jay McKenna noted on the podcast) and other full backs top the charts for tackle win percentage: the game is loaded unfairly in their favour. I wouldn’t ban tackle stats, but I’d ban tackle win/loss stats, as either way the defending player has taken the ball off the attacker, and isn’t that the most important thing?
Much like passing stats, I doubt any serious stattos out there would use tackle stats to vouch for a player’s ability in any case though, and that’s the key part of my overall message here: sceptics will say “stats don’t prove anything”; stattos will counter with “we never said that they did”.
Personally, I welcome anything that helps increase my understanding of a game which I love but was not blessed with the ability to play (and I mean to any level whatsoever). Let’s not ban stats, just the poor and/or misguided usage of them.
Pics: David Rawcliffe-Propaganda Photo
What are the criteria for working out the historical likelihood of a goal being scored? That seems, at first glance, either so vague or wide-ranging as to be inaccurate in the first place, making the xG thing irrelevant, or to require a database the size of Mars to work it out.
Take Sturridge spectacularly blazing over last night from 6 yards. The ball was slightly behind, in the air and required some adjustment from the player. Was that a difficult xG or simply a shit finish?
You’re on the right track when referring to “a database the size of Mars”. Michael Caley (whose tweet features in the article) has written a couple of articles about the method he uses (which can be found here: https://twitter.com/MC_of_A/status/704758347472441344) but the factors he uses include:
“where was the shot attempted from? What sort of pass assisted the shot? With what body part was the shot taken? Did the attacker dribble past his defender before trying the shot? How fast was the attacking move that led to the shot? Was the shot off a rebound or from a set play?”
It’s not a perfect system, but then equally there will probably never be one. I doubt it can take account of the scenario you mention below, in terms of the ball being slightly behind the player, for instance. To my mind it’s not supposed to be a definitive reading of how a game panned out, but a handy piece of shorthand to sum up how each team performed.
Thanks for this Andrew, I always enjoy your work and I think your point about analysis being useful retrospectively more than prospectively is a valid one.
Cheers, JC, glad you like it.
Nice article and great to see you popping up on here Beez!
Re. xG, I’ve always found it to be the best indicator of who ‘deserved’ to win for the reasons cited above. One big problem with it however is the fact that it relies on a shot having been taken… in the Swans game Phil had a great chance to shoot on his left but chose not to (lack of confidence on weaker side, thought teammate was in better position) and passed instead. That play wouldn’t have registered at all on Caley’s (excellent) charts. I believe this is also why Arsenal (and to a lesser extent Liverpool) annoy observers using xG as a proxy for “who deserved to win” with their tendency to want to walk the ball into the net.
There are xG systems out there which account for the sort of thing you’re talking about, and include more than just shots. However, they’re kept under wraps as people probably want to sell them to clubs! And this point applies to a lot of stats – what’s in the public domain and what actually exists are often very different things. Amateurs like myself are restricted to what the likes of Stats Zone share in their app, but tons more data will be collected.
Goals scored v goals conceded is the only stat I care about.