The Numbers Game | How Data Is Changing Football | Documentary


Clubs at every level of the football pyramid are becoming smarter and more efficient. How? The use of data.

To hell with conventional wisdom. The way we’ve been doing it, it’s not been working.

Analysts are now recording data from thousands of actions during games and training sessions to help shape pre-match preparation and post-game debriefs, pinpoint transfer targets and develop young talent.

The genie is out of the bottle and I don’t think it’s going back in. We may know more about the opposition than they actually know about themselves.

The growing use of analytics in football has attracted criticism and cynicism.

These are athletes – they’re not spreadsheets.

Battle lines have been drawn between the analysts and the traditionalists. Can football be translated into numbers by data bods? Or does it require special insight from real football men?

In 2002 one of the most unfashionable teams in Major League Baseball – the Oakland Athletics – defied the odds to go on a record-breaking 20-game winning streak. Their success was powered by a new approach to player recruitment: sabermetrics.

What started out as… I had played 10 years professionally, so when I stopped playing I entered the front office.
I started reading this stuff – again, the baseball academics. It made sense to me and
I had my own experience with which to look at both sides. I came from a traditional baseball background as a player and I was reading this new stuff that sort of put player performance in order for me. It was very rational. I could see why a baseball team was good. You could
look at numbers and explain why they were good instead of looking at things anecdotally and trying to use non-quantifiable reasons to explain success. We were one of the smallest teams in the league – we were actually losing money – but it also created a great platform. It meant
that if we just did things the same way the New York Yankees (aka the Manchester United) did, we were destined to finish where our player wages said we should. If you had the
lowest payroll, you were probably going to finish in last. We had the opportunity, because we had nothing to lose, to implement something different.

The success of the Oakland A’s encouraged sports teams around the world to replicate the model pioneered by Billy Beane. Early adopters believed the Moneyball approach could give them an advantage over their competitors.

We knew it worked on individual players and were able to apply it to a whole team. We won four division titles… three division titles and a wildcard, where we averaged almost 96 or 97 wins per year, so we had
immediate success. The biggest thing, the most important thing was we understood why
we were successful and we understood where we went wrong, I mean the numbers would show us. Billy Beane had this huge luxury of not looking at relegation. If you don’t have to look at relegation, you can try all kinds of stuff. Analytics and big data are driving the strategies
of major corporations around the world, and these methods are now filtering into football.
From the boardroom to the boot room. Football clubs over the last 10-15 years have had to deal
with a technological revolution. What that’s meant is they’ve now started to collect
through third-party vendors, lots and lots of data on football. That data was primarily collected for fans and media outlets to use. It has made its way into the clubs themselves and now you have football departments that have to contend with an avalanche of information.

Sports data is basically a reconstruction of the match. Why do we collect data? It’s basically so we can tell a story of how the match is played, so you can look at it through various lenses. So, you can have event data – how many passes and shots – but as we know
in football, it’s not a great reconstruction. But if we have the tracking data – so if we can
see the dots run around – we can basically reconstruct the game in a better way. It’s like having a scout at every game and not just having a scout at every game, because
we’re collecting data on everything the player is doing on the field. It’s like having a scout for every player in every game because everything they do is recorded. Now it’s not so much about collecting the data. It’s making sense of that data. The stakes are high at the top of the footballing
pyramid but lower down, one bad season can have catastrophic repercussions. Small clubs
with limited budgets can’t afford to make a mistake. To reduce the risk of acquiring
a dud signing, they’re turning to Beane’s sophisticated sabermetric approach.

I like to try and get to the training ground as often as I can and help out with the guys down there. But a lot of the time I’m based here in Ecotricity with a four-screen setup. I’m surrounded by a lot of energy traders and at times there are million-pound deals getting made and I’m sat here watching League Two football and providing analysis, so it’s a pretty unusual workplace. Probably quite different to a lot of analysts in the Football League, but it’s good.

So, here at Ecotricity I’m the
Chief Operating Officer. In energy trading, we buy and sell energy – mostly buy – to meet the needs of our customers on a day-to-day basis. We’re able to take a lot of the skills and the data analysis that we undertake in the trading front of energy into the world of football. So we saw it as an opportunity to be creative in the data and analytical space and see if we can form a competitive advantage at a lower
level. It wasn’t really necessarily about budgets, but it was about trying to maximise what we can get out
of every single player that we recruit, trying to bring together a list of players that is the best from the manager’s eye and also augmenting that with the data and also performance wise, we wanted to understand all aspects of our performance. So essentially we’re doing the same thing in business, and taking that into the world of football. The Billy Beane story is originally a story about player recruitment and finding inefficiencies in the market on
the back of going against conventional wisdom. They used data to try and scout players, try
to find players that no one else wanted that were able to do things that would help the
team win. Manchester United and Burnley are very different clubs despite the fact that they play in the same league and as a result Burnley has to take a very different approach to putting together a team than Manchester United. There’s a lot of money being spent, but for more the mid-level clubs, there should be bargains available. If they’re smart with
the data, if they look through it with a certain lens, they could be able to find some gems out there.

Yeah, this is where all your goals are coming from. So, a lot of them are in the six-yard box. If we get the ball to you in there then, that’s sweet.

That’s my bread and butter.

The recruitment side for a small club is really key and it’s important that we’re different. In January, every club will be after the same players and we probably
can’t compete for those players that everyone is after so we have to find other types of
players and we have a different way of playing. We have to find players who can fit into that
and we have to use the data for that. The one I would definitely pick out is Christian
Doidge. He’s been our top scorer last year, he’s our top scorer this year. I think he’s the
second or third highest goalscorer in the top six English leagues in 2017. I think Christian has done better than we envisaged, but we knew that the basics were there. We knew he could score goals, we knew he got in the right positions on the pitch because his data showed that. It was then a case of us trying to work with him – how to convert those chances from the positions
he got into which his data showed. That one is proof that the data works. I mean the value
for money on that one – we paid £30,000 for him and he’s worth an awful lot more than
that now. Tom who looks after the data… I’ll give him a list of targets and he’ll go through them and give us graphs in terms of their value and what they’re good at, what they’re not good at, what their metrics are in terms of – if it’s a striker, expected goals.

As analytics evolve, new metrics arrive and some are more widely accepted than others. Expected goals is one example of such a seemingly decisive tool. So what exactly does it mean?

It’s a tool for measuring the probability of a shot from a specific location resulting in a goal. We look at thousands of different shots that occurred in League One, League Two and the National League, so we make it relevant to our level of football. We’ll then apply where it was on the pitch, the angle, the distance: was it a headed shot? Was it a shot
with the feet? How was it assisted? Put all those things into an algorithm and that will then produce a number which will tell us how likely that is to result in a goal. If the expected goal is 0.15, 15% of the time, a shot from that location will result in a goal. It makes me feel a lot better about myself because my expected goals is a lot less than
what I’m achieving at the moment. That’s good for me. I just think that football is changing
and any little inch you can get, it helps out massively. It might be the difference at the end of the season between getting promoted or relegated. I had nine games without a goal this season and the manager pulled me and said, “Listen. I know we’re having bad
results at the moment, but I don’t want you to get involved with stuff that you’re not
as good at. You’re best when you’re in the box and you stay between the width of the
goal. That’s where you score your goals.” I done that and I’ve gone on a little bit
of a goalscoring run so that’s where the stats have helped me and the manager. It tells
me where to run and what positions I should get myself into to help my game as much as possible and the team.

Competing against the Premier League’s mega-rich requires creative thinking. To punch above their economic weight, Southampton created the black box: a live database collecting player metrics from every major league. This has enabled them to acquire players of undervalued talent and sell them on for a profit. Sadio Mané, Dejan Lovren, Morgan Schneiderlin, Victor Wanyama – the list goes on.

A lot of the KPIs that we look for in the different positions is something else which has been consistent for quite a
while. A lot of the scouts know the type of players that we’re looking for at the football
club. They will already be creating scout reports for any players that they’ve seen out there so they
can recommend them to put on our target list and someone that we need to look at as
a potential signing for the football club. We’ll also use the data on a global scale to highlight any top performers and from that, will be an area that we need to provide some more scouting information on so that’ll be from the eye from our scouts. Yes, there are some players that will have been signed because their stats look good. Payet at West Ham is a good example. Gabriel at Arsenal was a good example of that kind of an approach, but that’s really kind of missing the point. The point of analytics is doing things differently. One of the reasons for these crazy prices that we’re paying for players these days is that people get really wedded to one player. They think this is the guy we need to have him and we’re willing to pay over the odds. What data can do is help you generate options that maybe find guys that are kind of like that other guy or maybe who would fit into the team in a slightly different way. It allows you to walk away from a bad deal. It allows you to walk away from a really expensive deal. Football has actually been collecting the most data for the longest time, but football is the most complex sport. So it’s low scoring, it’s continuous, it’s time varying, it’s very strategic. It’s very subjective. Just say you and I were analysing a game, we could come up with different opinions. When you compare it to other sports, like basketball – it’s high-scoring – tennis
and American football, they’re segmented. Baseball is segmented. It’s very easy to do
the analysis, you have a lot of data points. The key for football is actually to
come up with the right language and ask the right questions for specific things. How is
our formation? How did we press? How were we on set pieces? Did we attack via the counter-attack? All these different things, we have to learn directly from data.

When I played it was a video recorder and looking at the game back. Now we monitor them every day, and in terms of their sleep, their training, everything they do – it’s massive. We may know more about the opposition than they actually know about themselves. I think the coaches can see a certain amount
of what the data does and it backs that up. We can look at data of the team we’re about
to play and we can break down the strengths and weaknesses of the team that we’re playing. There was a game a few weeks ago, a game that we actually went onto win. In my opponent report, I noticed that the team played pretty deep, their average position was quite deep and their pressing metrics weren’t very high. So they allowed you a lot of time on the ball. I suggested that we would be able to play
a lot of football and we did. We sort of passed them to death. I’d also highlighted
an area where they were weak and they conceded a lot of shots. I said: “If we can
get our key players in these areas, there’s a fair chance we can score from here.” We actually scored our first goal in exactly that area. Data in terms of pre-match, a lot of it is video based, but in terms of statistical data it’s used to look at trends so it won’t be just from one game – it will be from game-to-game so we’ll build a database to create a performance
profile on that team and look at any individuals that are performing to a higher level. The black box also helps Southampton develop homegrown talent that they can sell for huge profits. Data helps drive player recruitment at academy level and to maximise the potential of their
scholars. I started training when I was eight and then I finally signed at nine so quite
a young age. I think when we first got here, it was just a load of numbers on a sheet, but now we understand what it actually is, the details of it and where we can improve and what we need to look at. It’s helped me massively. When I first got here, I didn’t know what to do with it – just watching the game, I wasn’t really taking notice, but as I had to learn more, I think I focused on myself more and the positioning I’m taking up and all the little details – you can sort of figure out what you have to do to be better. It’s helped me massively develop. A founding principle in this organisation is youth development. It’s everything we stand for – excellence, potential. It’s the strapline, it’s everything you work
towards. Even when you buy a senior player, the principles are still the same. Can we improve him? Because we may be selling him and if we are selling him, we need to be selling him for
a profit. It’s all about improving that individual. It was never really the dream to produce a
player to sell – it became the business model when the first team started sliding through the leagues and ultimately into administration. It was the selling of players: the Theo Walcotts, the Alex Oxlade-Chamberlains, the Gareth Bales. We all, as fans and also as a staff member here, we all dream of, ‘What would have happened if we’d kept hold of those players?’ The reality is, if we’d kept hold of those players, we would have gone out of business.

There’s a huge amount of data that is collected around the players, from matchday data to the way they sleep, to the way they’re feeling in the morning,
to their training power outputs in the gym. The challenge is what we do with that data
and how important is it? The analytics around that data. On a daily basis we collect information
about the players from GPS units. We would look at distances covered, the speeds at which
they’re covered and other information such as accelerations and decelerations. We would use that in a more individualised approach. We can then optimally adjust their training
programs to make sure that they’re fresh and they’re in peak condition come matchday.

We’re now in an amazing position where for the first time we’re able to turn down those opportunities to sell players and push back against the big clubs and turn around and say, ‘No, not for sale.’ It’s a huge point in the game now.

There’s obviously a lot of other sports which use data or have the analytics. Soccer has not yet cracked the code in terms of what are the key indicators of what is going to make a player successful or not. I think there are several
companies out there that aggregate the data and try to make it easier for you to make
a decision, but at the end of the day, I think soccer people still want to see the player
and see how that marries up with the data that you’re seeing because sometimes the data doesn’t
always match with what you’re seeing on the field because of the free-flowing nature of
the game and the fluidity of the sport. The mentality of a player and the soccer
IQ are things you’re only going to get from seeing them, sometimes live, obviously on video as well, but also from sitting down with that player and having a conversation with them about
the game itself, about his particular skill set, about your own club’s philosophy on the
game and see if there’s a match there. You can’t get answers to that from data.

Analytics has come a long way from pass completion rates and heat maps. Some of the brightest minds in the game want to find an algorithm to calculate the most valuable intangibles, like team chemistry. What will this mean for the future of football?

All goals aren’t created equal and the ability to weigh the difficulty of those goals – the player with the skill set to do those things should be rewarded as opposed to a guy who maybe just tapped one in because Suarez drew three defenders on him, penetrating, and he flipped it off to him and the other guy just taps it in. Well, the goal gets paid for in today’s world, but shouldn’t the guy who created all those things? Measuring those things is really the challenge, and giving proper credit to player performance is what we’re all trying to achieve, not just in baseball, but in
every sport – just like in business. There’s lots of cool stuff that people haven’t
thought about. The idea of ghosting, being able to simulate plays that you haven’t seen
before. You can have an example of a play and you can say, ‘Well, how does this team defend in that situation? What happens if I switch that player with another player? How does the outcome change?’ In terms of body shape, where is the player facing? Are they making the right decisions? In terms of injury analytics – player load, fatigue, how’s their technique changing over time? Now, using deep neural networks we can actually simulate these things. I think in terms of injury prediction, you’ll find there will be fewer injuries – there will be fewer soft tissue injuries; I think those will be minimised. I think in terms of player valuation, in terms of performance, that will be normalised. I think the volatility we see now is because we haven’t got these good metrics.

However, what you don’t take into consideration is the media… the shirt sales – there’s all these other things that need to be taken into account. You’re never just going to have data making a sole decision in anything, but as data advances and the individuals that are part of that process are creating and maximising the use of data in clubs and in different sports, I think those people are more crucial in the process and I think data becomes more important in what we do from day-to-day.

We have to communicate with domain experts (players and coaches) and if we can’t speak their language, then we’re basically not going to be let in. It’s an exciting area to be in because it’s constantly evolving and improving, as technology improves.

The genie is out the bottle and
I don’t think it’s going back in. When you’ve got open-minded people, it works really well. Hopefully it can tell us if we’re going to win or lose, that would be nice wouldn’t it? If today it can tell me if we’re going to get three points on a Saturday it would save me an awful lot of work.
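
The expected-goals description in the transcript (shot location, angle, distance, headed or not, all fed into an algorithm that returns a probability) can be sketched roughly as follows. This is only an illustration and not the club's actual model: the `Shot` fields, the logistic form and every coefficient below are assumptions invented for the example, whereas a real model would be fitted to thousands of historical shots from the relevant leagues.

```python
import math
from dataclasses import dataclass


@dataclass
class Shot:
    distance_m: float     # distance from the centre of the goal, in metres
    angle_deg: float      # angle subtended by the goal mouth, in degrees
    header: bool          # True for a headed shot, False for a shot with the feet
    scored: bool = False  # whether the shot actually resulted in a goal


# Hypothetical coefficients, chosen purely for illustration.
# A real model would learn these from historical shot data.
INTERCEPT = -1.0
W_DISTANCE = -0.12
W_ANGLE = 0.03
W_HEADER = -0.8


def expected_goal(shot: Shot) -> float:
    """Return an illustrative probability (between 0 and 1) that the shot is scored."""
    z = (
        INTERCEPT
        + W_DISTANCE * shot.distance_m
        + W_ANGLE * shot.angle_deg
        + W_HEADER * (1.0 if shot.header else 0.0)
    )
    # Logistic function maps the weighted sum onto a probability.
    return 1.0 / (1.0 + math.exp(-z))


def summarise(shots):
    """Return (total expected goals, goals actually scored) for a list of shots."""
    total_xg = sum(expected_goal(s) for s in shots)
    goals = sum(1 for s in shots if s.scored)
    return total_xg, goals
```

With these made-up weights a close-range shot with the feet rates higher than a long-range effort, and summing the values over a striker's shots gives a total that can be set against goals actually scored, which is the comparison the striker makes when he says his expected goals are lower than what he is achieving.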

22 thoughts on “The Numbers Game | How Data Is Changing Football | Documentary”

  1. Hey! I've watched all your videos and I just want to say a big thanks for helping us beginners out there. Your channel is amazing and you make it a lot easier for us to understand and learn the game properly. I'd really appreciate it if you could do a video series on how to become a better player. Wish you a great new year and I really hope that you'll consider my request…

  2. I wonder if soccer will take to analytics. Other sports already have, namely baseball and hockey.
    Soccer seems like it's an old boys' sport. Change rarely comes. Rules rarely change. Video replay is only in its infancy.

    I wonder what will be the first major change that begins a copy cat mentality.

  3. If we can win millions of pounds in financial markets using data, we can do it again in football. But maybe we've got a problem if we couldn't win those millions of pounds in financial markets?

  4. My dream job – being an analyst for Tottenham. This documentary is amazing. I got my degree in economics because of Moneyball, Billy Beane is the man.

  5. “Small clubs with limited budgets”, shows Forest Green who had probably the biggest budget ever seen in the conference and lost millions each year which would send most clubs at that level out of business

  6. Great documentary! So there is something behind the team: the intelligent people who work behind the club. Football nowadays is more than just a competition to win titles; it is also a "Data Analytics War" among the clubs. Interesting.

  7. Hi everyone, here I talk about football analytics: https://www.youtube.com/watch?v=InDZHrWkCTk&feature=youtu.be
    They can certainly be useful, but use them with caution!

    Tell me what you think.

    Bye

    Thanks
