A Brazilian at the Football Innovation Conference

This past weekend, I traveled to London to attend the first StatsBomb Innovation in Football Conference, held at Chelsea’s Stamford Bridge. In case you are not aware, StatsBomb is one of the leading football data analysis companies: it provides consulting for clubs and acts as a data provider, collecting data from 40 leagues around the world. This conference brought together researchers, club analysts, companies and football fans who are interested in a more analytical and technological perspective on the game.

I admit that I came back from the conference with mixed feelings. On the one hand, I am very excited about the possibilities in football analytics: the area is growing fast, and we should see more and more clubs and companies exploring modern data science tools to make better decisions. On the other hand, it is clear that the forefront of this area is centered in Europe (as well as North America), and it will be difficult (but not impossible) for South American countries, such as Brazil, to achieve the level of innovation and research seen in the old continent.

I couldn’t find any other Brazilians at the conference; I believe that almost all of the participants were from Europe or North America. If you happen to be Brazilian, or from any other country in South America, please contact us (send an email to algolritmoblog@gmail.com)! I would love to hear more people’s impressions. Also, if you work for a professional club or football project, or are simply a football analytics enthusiast, feel free to contact us to chat as well.

I would like to congratulate the event organizers: everything went very well, and I was extremely pleased with the quality of the presentations, venue and food. It did not seem like it was the first time they were organizing this conference. My only “complaint” is London’s rainy weather, which is not ideal for a Brazilian like me. Maybe an upcoming edition of the conference could be held in South America?

On the left there is the line up for the “main room” and on the right for the “research room”. The research area was extremely popular and often full.

Below I have summarized my main takeaways and personal reflections on the event. I have also tried to summarize the presentations I attended (unfortunately the event had simultaneous talks, so it was impossible to watch everything). Soon, StatsBomb will release videos of almost every presentation for those who wish to see them in full. I hope you enjoy the material as much as I did.

My Takeaways and Reflections

  • Sports science can yield incredible results, especially with the support of robust data analysis. Ajax is the perfect example of this.
  • Many extremely intelligent people are developing similar models that seek to understand the value of different actions on the field. It seems to me like a “natural” next step after the creation and consolidation of expected goals (xG) models: now that we can measure the quality of a shot, we need to know the value of passes, dribbles, tackles, interceptions, etc.
  • Tracking data (data on each player’s position on the field at all times of the game) is now available to clubs, but few independent analysts have access to this type of data. We already have the first results of tracking data analysis, but there is still a sea of possibilities.
  • It is commonly agreed that communication is a vital aspect of producing effective football analytics. Coaches and players usually do not have much time (or interest) to dig into complex mathematical models; they just want to know what they must do to improve and win the next match. One of the analyst’s main tasks is to know how to extract the main points of an analysis and communicate them simply, clearly and convincingly.
  • Smart clubs already use data analytics in the following ways:
    • Opposition analysis (other teams’ and their players’ performance)
    • Self-analysis (team and player performance)
    • Development of game models
    • Scouting and recruitment
    • Long-term scientific research (best ways to play, how to correctly evaluate a player, which drills produce better technique in young players, etc.)
  • There is already a huge gap between teams from South America and teams from Europe due to budget differences. If our (South American) clubs do not incorporate data analysis into their processes, the distance is likely to become even larger. This will certainly affect the national teams as well. It is not enough to have top South American players playing in Europe; our coaches and football executives also need to be aware of innovation in football.
  • I believe Algolritmo has the potential to bring some of this more analytical view into Brazilian football. My idea is to produce content that can start other types of debates about our clubs and players. However, I want to make it clear that I am not the kind of person who belittles the more “romantic”, humorous, and sociological views of football. In fact, I am a big consumer of this variety of material. I just think there are also other ways to enjoy this sport.

Conference Opening

Speaker: Ted Knutson, StatsBomb’s co-founder and CEO.

The conference began with a brief opening by Ted Knutson. According to Ted, football data analytics is no longer an “early stage field”: it is a reality, and the world’s top teams are increasingly realizing the value of these tools. Liverpool, which has 4 professionals with PhDs in areas such as physics, mathematics and astrophysics, is a great example of this. StatsBomb is a great data provider, and at the moment it is the only company to include information such as:

  • Pressures
  • Location of all players at shot moment
  • Shot Height
  • Foot used in actions (shots and passes)

In addition, Ted also talked about some of the next steps for the company. In the long run, they intend to:

  • Integrate videos into their data platforms
  • Start to collect tracking data
  • Offer “live” data to be used by teams (and perhaps media) throughout matches

Lastly, Ted talked about some StatsBomb initiatives that I admire a lot. The company offers some free data so that analysts who want to get familiar with football data can experiment (at the moment, for example, anyone has access to all 2018 World Cup games and all of Lionel Messi’s career games). The company also offers women’s teams free access to its data analysis platform.

Analytics as a Vocabulary: Giving Stats the Power of Language

Speaker: Seth Partnow, former research director at the Milwaukee Bucks and NBA analytics writer at The Athletic

Seth Partnow was the only “outsider” speaker: he has worked at the NBA’s Milwaukee Bucks and now writes about basketball analytics for The Athletic. It was very interesting to hear someone who worked in a sport where analytics is already at a more advanced stage. Seth talked a little about the arbitrariness that exists in defining statistics in most sports. Analysts often spend most of their time developing fancy models, when in fact they should be discussing which fundamental aspects of the game need to be measured.

The key to successful data analysis is knowing what to count and how to count it. This may seem trivial, but it is actually complex and somewhat philosophical. Seth said something I have heard in other lectures and strongly agree with: “The value of a data science project is inversely proportional to its complexity.” In other words, the most useful and valuable analysis is usually simple, understandable and easy to apply. Descriptive statistics is a very powerful tool that is often underrated. In addition, metrics designed for sports should have good names that are easy to understand and self-explanatory. Seth showed some examples of what he considers good and bad names (see image below).

Some examples (good and bad) of sports analytics metric names

At the end, Seth also discussed the issues with rating players as superstars. The NBA has grown a lot globally in recent years and players are increasingly being treated as big stars. Along with this status, there is also the idea that this group of individuals is so extraordinary that a “superstar” can take the team to the playoffs and make it a contender for the title almost on his own. Seth showed an analysis of the number of wins and title probability that different players add to teams (not including player names). According to this analysis, very few players can really, by themselves, dramatically increase a team’s title chances. In my casual basketball watching experience, these players today are: LeBron James, Kevin Durant, Stephen Curry, James Harden and Giannis Antetokounmpo.

Partnow discussed player evaluation in terms of their offensive value

Understanding Entry Zones in Football

Speaker: Estefania Vidal, researcher and PhD candidate at MPIDS (Max Planck Institute for Dynamics and Self-Organization)

Estefania was one of the participants in the research competition who was chosen to present. In her study, Estefania divided the field into 4 zones (defense, defensive midfield, offensive midfield and attack). The idea of the research was to better understand how teams enter the attacking zone, what the most common patterns are and which strategies are the most effective. Some of the key learnings and conclusions were:

  • If a team cannot shoot within 12 seconds after entering the attacking zone, it is best to return to the midfield zone, regroup and start a new attack. In other words, when a team initiates an offensive attempt it must be completed quickly to have a better chance of scoring a goal.
  • Generally speaking, teams tend to enter the final “quarter” near the ends of the field (closer to the sidelines than the center of the field).
    • When entering through a pass, the ball tends to be “opened” with passes directed toward the sidelines.
    • When entering through a ball carry, there is a tendency to go centrally, towards the goal.
  • Passes are generally more effective weapons for entering the attack zone than carrying the ball and dribbling, though it all depends on the context and the available players.
Estefania investigated which types of passes are more efficient for entering the “attacking zone”

How to Shoot and How to Save: Football Analytics for Dads

Speaker: Łukasz Szczepański, quantitative analyst at Smartodds

Łukasz’s research idea came about when he was playing football with his children in a park and realized that he was giving the kids advice without the slightest scientific basis. His children asked where they should aim when they were shooting, and he answered “as close as possible to the upper angles of the goalposts.” It was then that Łukasz realized that he was not sure if this was the best recommendation and did what any parent would do: a scientific study to answer the question with evidence. After much study, Łukasz concluded that ideally a player should aim relatively close to the goal posts (but not right next to them). The researcher took into account that there is a margin of error in shooting. Everyone has felt the presence of that margin of error when trying to place a shot in the top corner and sending the ball miles away. Hitting the exact intended location is extremely difficult.

Goal probabilities and optimal aim spots according to different skill levels

Łukasz also analyzed goalkeeper positioning. The main conclusion was that goalkeepers tend to “cover” the near post too much, leaving the far post too exposed. The presenter made a point of highlighting the assumptions of the model and its limitations. For example, there is no data that shows where a player was aiming for a given shot. Therefore, he assumed that, on average, players put their shots where they want them (but with some variability around that point).
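To make the “margin of error” idea concrete, here is a minimal sketch of my own. It is not Łukasz’s model: it looks only at the horizontal direction, assumes the shot lands at a normally distributed distance from the aim point, and assumes a hypothetical goalkeeper who saves everything within `keeper_reach` metres of the centre of the goal. The only real-world number is the 7.32 m goal width; `sigma` and `keeper_reach` are made-up values.

```python
import math

GOAL_HALF_WIDTH = 3.66  # metres; a regulation goal is 7.32 m wide


def normal_cdf(x, mean, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))


def scoring_probability(aim, sigma, keeper_reach):
    """Chance that a shot aimed at `aim` lands between the keeper and a post.

    Assumption: the keeper saves every shot with |x| <= keeper_reach,
    and every shot with |x| >= GOAL_HALF_WIDTH misses the goal.
    """
    def in_band(lo, hi):
        return normal_cdf(hi, aim, sigma) - normal_cdf(lo, aim, sigma)

    right = in_band(keeper_reach, GOAL_HALF_WIDTH)
    left = in_band(-GOAL_HALF_WIDTH, -keeper_reach)
    return left + right


def best_aim(sigma, keeper_reach, step=0.01):
    """Grid search for the aim point (centre to right post) with the best odds."""
    n = int(round(GOAL_HALF_WIDTH / step))
    candidates = [i * step for i in range(n + 1)]
    return max(candidates, key=lambda a: scoring_probability(a, sigma, keeper_reach))


# Hypothetical shooter (0.5 m aiming error) and keeper (2.0 m reach).
aim = best_aim(sigma=0.5, keeper_reach=2.0)
print(f"best aim: {aim:.2f} m from the centre")
print(f"P(goal) at best aim: {scoring_probability(aim, 0.5, 2.0):.3f}")
print(f"P(goal) aiming at the post: {scoring_probability(GOAL_HALF_WIDTH, 0.5, 2.0):.3f}")
```

Even in this toy version, aiming exactly at the post wastes roughly half of the shots wide, so the best aim point sits some distance inside the post, which is consistent with the conclusion of the talk.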

“How do they do it?” – Technical Analysis of Elite Skills in Football

Speaker: Vosse de Boode, head of sports science at AFC Ajax

I was very impressed with what I saw at Vosse de Boode’s presentation. The consensus among the other participants I spoke with was that this presentation was one of the great highlights of the conference. Vosse showed several examples of how Ajax uses science to understand the best ways to play football (especially in terms of technique) and how to teach them to players. Ajax has a real laboratory for these studies, where they can run different experiments using scientific methods. The focus of the work in the Ajax lab is to generate knowledge and try to pass it on to players, especially to younger athletes, who are still learning and in the development stage. Of course, the first team players also benefit from the lab’s findings, but it is at the academy level that there is the greatest potential for technical growth.

One of the most interesting examples presented by Vosse was about goalkeeper posture and stance. Generally, goalkeepers are advised to keep their legs open at shoulder width, but they saw that Onana (the goalkeeper bought from Barcelona) kept his legs wider than “recommended”. Instead of trying to change the stance of the Cameroonian keeper, the Ajax lab decided to study that unusual pose. The researchers concluded that Onana’s stance was more efficient than the “traditional recommendation” because the goalkeeper would be able to cover a larger area of the goal and jump better (especially downward where most shots are directed).

Ajax does experiments with special eyeglasses that can analyze a player’s visual focus.

In addition, Ajax also researches players’ visual focus. With special glasses, the lab can analyze what players observe and their main points of focus when performing different football-related actions. One of the lab’s key findings is that the best finishers take a good look at the target once, then look directly at the ball and shoot. It is amazing to imagine the endless possibilities for studies that this lab is capable of and how it can positively impact player development.

Ajax seeks to learn how to improve player technique; visual focus during shots is a good example.

Unfortunately the video for this talk is one of the few that will not be available on YouTube due to an agreement with Ajax. However, much of Ajax’s research is published in academic journals.

Finding the Free Man: A Contextual Approach for Identifying Spaces that Matter

Speaker: Javi Fernández, Head of Sports Analytics at FC Barcelona

Javi is one of the top names in football analytics and has lectured at other conferences I’ve been to, such as the Sloan Sports Analytics Conference at MIT. Since he works for Barcelona, his examples and videos are always related to the Catalan team, which is always pleasurable for those in the audience.

The speaker began his presentation with a question: “Aren’t we, in general, focusing too much on shots during our analyses?” Of course, scoring goals is the ultimate objective of football teams, and it is natural to study shots since they are goal-generating actions. However, we should also be studying the other actions that lead up to goal-scoring situations. Javi’s idea was to create a model that could generate this kind of analysis. Inspired by NBA models, Javi (along with Luke Bornn and Dan Cervone) created the Expected Possession Value (EPV) model, which assigns a value to virtually every action on the field. If you already understand the concept of expected goals, think of EPV as an xG model that took steroids, learned kung fu and bought a sword: it’s way more powerful.

Basically, EPV tells you the probability of a team scoring (or conceding) a goal, given several contextual variables of the game (especially the location and arrangement of players on the field). The result of an EPV model is always a number between -1 and 1. The closer to 1, the greater the chance that the analyzed team will score at any given time, and the closer to -1, the greater the chance of conceding a goal. The EPV model considers the different actions a player can execute and how each one of them would influence a team’s probability of scoring (or conceding) a goal. The possible player decisions included in the model are:

  • Passes (and open pass options)
  • Shots
  • Dribbles (individual ball progression attempts)
Javi showed that the EPV model considers different contextual scenarios of a football match

For technical readers, the EPV model is a Markov decision process. It considers contextual information about the match (i.e. where each player is located) and calculates the probability of a team scoring (or conceding) a goal given the different actions a player can take, the probability of the player making each decision and the likelihood of successfully executing it. According to Javi, EPV is a model for understanding decision-making through probabilistic surfaces.
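As a rough illustration of that description, the value of a game state can be written as an expectation over the available actions: for each action, the probability of choosing it times the value of the outcome, weighted by the chance of executing it successfully. The sketch below is my own toy version with made-up numbers, not the model presented in the talk; in the real model these quantities are estimated from tracking data, whereas here they are typed in by hand, and the `Option` class and all of its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Option:
    """One action available to the player on the ball (all fields hypothetical)."""
    name: str
    p_chosen: float          # probability the player picks this action
    p_success: float         # probability the action succeeds if picked
    value_if_success: float  # EPV of the resulting state, in [-1, 1]
    value_if_failure: float  # EPV after a failed attempt, in [-1, 1]


def expected_possession_value(options):
    """Expectation over actions and their outcomes."""
    total_p = sum(o.p_chosen for o in options)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("action probabilities must sum to 1")
    return sum(
        o.p_chosen
        * (o.p_success * o.value_if_success + (1.0 - o.p_success) * o.value_if_failure)
        for o in options
    )


# Made-up numbers for a midfielder with three options.
options = [
    Option("pass", p_chosen=0.6, p_success=0.85, value_if_success=0.08, value_if_failure=-0.02),
    Option("dribble", p_chosen=0.3, p_success=0.50, value_if_success=0.10, value_if_failure=-0.03),
    Option("shot", p_chosen=0.1, p_success=0.05, value_if_success=1.00, value_if_failure=-0.01),
]
epv = expected_possession_value(options)
print(f"EPV of this situation: {epv:.5f}")
```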

One of the most interesting ways to use the model is to look at the EPV value before and after an action and compare the difference. For example, if Arthur receives the ball in a situation where the EPV is 0.05 and gives a pass to Suárez, who receives it in a situation where the EPV is 0.12, we say that Arthur added 0.07 EPV to the play. In this talk, Javi focused more on how different actions generate and find space for the team. It was interesting to see that the model can capture the different ways that players influence the game, for example:

  • Arthur generates value for the team with his vertical passes that break through opposing lines. The Brazilian midfielder has stood out for this skill in the rounds played so far of the 2019-2020 La Liga season.
  • Messi can do almost everything, but his ability to generate space through successful dribbles is one of the attributes that adds the most value to Barcelona. Javi has also done a study of how Messi finds space simply by walking (not running) across the field.
Messi doing a stunt while being admired by Arthur and Pique
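The “EPV added” bookkeeping from the Arthur example can be sketched in a few lines. The first action uses the numbers from the example in the text (0.05 before, 0.12 after); the second action and the `(player, epv_before, epv_after)` tuple layout are hypothetical, just to show how per-player totals would accumulate.

```python
from collections import defaultdict

# (player, EPV before the action, EPV after the action)
actions = [
    ("Arthur", 0.05, 0.12),  # the pass to Suárez from the example in the text
    ("Suárez", 0.12, 0.10),  # hypothetical follow-up action that loses a little value
]


def epv_added_by_player(actions):
    """Sum, per player, the change in EPV produced by each of their actions."""
    totals = defaultdict(float)
    for player, before, after in actions:
        totals[player] += after - before
    return dict(totals)


totals = epv_added_by_player(actions)
for player, added in totals.items():
    print(f"{player}: {added:+.2f} EPV")
```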

Pressing, Press-Breaking and Footedness: Exploring StatsBomb’s New Data

Speaker: Michael Caley, host of the “Double Pivot” podcast; writer at 538 and The Athletic

Michael Caley is one of the few researchers who have studied football analytically for at least 7 years. I already knew his work through Twitter and recently started listening to his podcast, the “Double Pivot”, which he hosts together with Michael Goodman. At this conference, Caley presented his research on “pressures” on the ball.

Pressure is the main defensive action in football: the act of closing down an opponent’s space to regain possession (either by tackling or by forcing mistakes). However, until very recently there was no “pressure” data being collected, so analysts created different methods for measuring and approximating team pressure. StatsBomb now collects this kind of action.

Caley made a comparative analysis of the different pressure metrics developed by various analysts over the past few years. It was interesting to see that the team “pressure ranking” was slightly different for each metric, showing that each method reveals specific aspects of a team’s “pressure”. The presentation involved many technical details, so I will include just some of the main points here.

The pressure metrics outlined by Caley were:

  • Passes per Defensive Action (PPDA): This is a simple calculation: (opponent’s successful open play passes) divided by (defensive team actions). This metric analyzes how many passes the opponent completes before the defending team performs a defensive action (tackles, interceptions, challenges).
  • Moves Broken: This metric analyzes specific game situations, especially when the opponent attempts to build up from the back. The idea is to count how many times a team can “break” an opponent’s sequence of actions.
  • CB Zone Passes: This is a simple count of how many opponent passes are completed into the “centre-back zones”. The idea behind the metric is that teams with good “pressure” do not let the ball reach this dangerous area of the pitch.
  • High Turnovers: Number of times a team recovers the ball (through a tackle or interception) in the attacking part of the pitch (I do not remember exactly which zones of the field are considered).
  • Number of Pressures: The above metrics are an “approximation” of a team’s pressures: they measure possible effects of a pressure, but not the pressure itself. Using StatsBomb data, Caley also analyzed the pressure actions of each team.
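Of these metrics, PPDA is the easiest to reproduce. Here is a minimal sketch following the definition given above. The event format (dictionaries with `team`, `type` and `successful` keys) is a hypothetical schema of my own, not StatsBomb’s, and the events are assumed to be already filtered to open play.

```python
# Defensive action types named in the definition above.
DEFENSIVE_ACTIONS = {"tackle", "interception", "challenge"}


def ppda(events, defending_team):
    """Opponent's successful passes divided by the defending team's defensive actions."""
    opponent_passes = sum(
        1
        for e in events
        if e["team"] != defending_team and e["type"] == "pass" and e["successful"]
    )
    defensive_actions = sum(
        1
        for e in events
        if e["team"] == defending_team and e["type"] in DEFENSIVE_ACTIONS
    )
    if defensive_actions == 0:
        return float("inf")  # the defending team never engaged
    return opponent_passes / defensive_actions


# Made-up match fragment: 12 completed and 2 failed passes by "Away",
# 3 defensive actions by "Home".
events = (
    [{"team": "Away", "type": "pass", "successful": True}] * 12
    + [{"team": "Away", "type": "pass", "successful": False}] * 2
    + [{"team": "Home", "type": "tackle", "successful": True}] * 2
    + [{"team": "Home", "type": "interception", "successful": True}]
)
print(ppda(events, "Home"))  # 12 completed passes / 3 defensive actions
```

By this definition, a lower PPDA means the opponent completes fewer passes per defensive action, which is usually read as a more intense press.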

At times, the lecture seemed like a StatsBomb data “advertisement”, but this is understandable and fair since the data provided by the company is really excellent (I realize I just replicated the “advertisement” feeling in this sentence). Caley also showed an analysis of the players who are best at breaking pressure and completing passes; see the photo below (I apologize as it is a little blurry):

Ranking of the best players in terms of passing ability under pressure.

In addition, Michael Caley focused on a few teams during the presentation that stood out in different ways in terms of pressure patterns over the last two years:

  • Manchester City and Eibar were at the top of all pressure rankings.
  • RB Leipzig and Bayern Munich proved to be very effective in pressing; they exemplify the German press to some extent.
  • Torino, Barcelona and PSG are teams with inefficient pressure.
  • Tottenham and Chelsea have experienced tactical shifts in terms of pressure strategy over the past two years.
  • Real Madrid, Real Sociedad, Burnley, Bournemouth, Olympique Marseille and Lazio all had some “weird” metrics according to Caley.
  • Parma did not perform well in any “pressure” aspect.
Caley analyzed some specific teams throughout his talk, outlining differences in pressure metrics

Finally, Caley had a great piece of advice at the end of the lecture. He said to always use Messi as a sanity check for your offensive models. If the Argentine is not at the top of the rankings, then there is probably something wrong with the model.

How Football Could Revolutionize its Data Analysis

Speaker: Adrien Terascon, head of game analysis at Paris Saint-Germain

In this presentation, Terascon showed how PSG has incorporated data analysis into the club’s daily operations. The speaker presented some of the many well-defined processes that the team uses to, according to him, produce a “data-driven football comprehension”. According to Terascon, it is essential to have a clearly defined game model. For example, for PSG it is essential that the team is able to take the ball to the preferred crossing areas (PCAs). The PCAs are the wide zones within (and close to) the penalty box. Analysts know that the team can be dangerous if they take the ball there. PSG has several pillars for its game model, but Adrien Terascon could not reveal all the secrets, especially since analysts of potential opponents were paying a lot of attention in the audience.

After a game model is established, analysts develop various Key Performance Indicators (abbreviated to KPIs). It is important to avoid subjective definitions. The game plan needs to be measurable and easy to understand. From the game model the club creates several other models and processes, such as:

  • Micro game models
  • Individualized models (which includes creating specific training sessions)
  • Scouting processes
  • Player evaluation processes
  • Opponent analysis
The photo is a bit blurry, but we can see some examples of personalized instructions that PSG created for Gueye, Herrera and Marquinhos for the match against Real Madrid.

The PSG Head of Analysis also spoke about the need for “individualization” in data analysis, which means a great effort to study each player in order to generate personalized instructions (all of them consistent with the same game pillars). Terascon stressed that each player learns differently, and the coaching staff needs to be aware of this. Thus, instructions must be communicated to each player in the best way possible. He said, for example, that Mbappé does not absorb much new information through lectures or reports. For the young French player, what works best is specific training sessions that reproduce what he should do on the pitch.

Another point emphasized by Terascon was simplicity in communication. Club managers, players and other staff do not need to be aware of all the information generated (with countless pages of charts and tables), as they can get overloaded and disinterested. It is crucial to curate the material. Terascon repeated several times that communication needs to be extremely clear, simplified and straightforward.

During the lecture, he also showed an example of a pre-match analysis his staff created for the game against Real Madrid (which PSG won 3-0 in the current Champions League edition). They knew they needed to be careful with Real Madrid’s offensive transitions and crosses (Real is one of the teams that cross the ball the most). With that in mind, they gave players specific instructions so that PSG would be aware of the regions in which it would be acceptable to lose possession: the team sought to take more risks in areas from which Real Madrid would not have a high chance of creating successful counterattacks. They also tried to neutralize the players who start the transitions and complete the crosses.

Terascon explained that simplicity is key in terms of communication.

Another interesting thing that PSG does is “data tuning”. The team tries to use players in positions where it knows they can overperform, in order to attract the attention of potential buyers. Terascon cited the example of a center back (his name was not mentioned) that the club was recently willing to sell: they started using him as a left-back since they believed the player would produce better numbers there, increasing the value of a future transfer.

Unfortunately, just like Vosse de Boode’s lecture on Ajax, this presentation will not be shared online. Still, it is worth keeping a close eye on PSG, which has joined the “smart football” movement.

Some things aren’t Shots: Comparative Approaches to Valuing Football Actions

Speaker: Thom Lawrence, StatsBomb’s CTO

Thom Lawrence, StatsBomb’s CTO, was another speaker who sought to understand actions in football other than shots. Thom believes football has essential elements that go far beyond goals and finishes, but he simultaneously fears that deep down the sport may be a simple pursuit of good shots to easily score goals. He enjoys models such as EPV, but knows they have limitations (such as the difficulty of defining exactly what a possession is in football, for example). So, he tried to analyze the problem using various methods.

Thom Lawrence began his talk by commenting on some issues he sees in Expected Possession Value (EPV) models.

In this presentation, Thom spoke in detail about different reinforcement learning techniques (a specific “type” of artificial intelligence) that could be applied to understand the value of various actions in football. According to the presenter, some of these models are computationally heavy, and he joked that the ideal way to develop them is to “run these models on AWS machines paid by your club/company until you make them bankrupt”. One of the main advantages of the presented techniques is the fact that models can be created with raw data. You do not need to do extra data preparation, such as deriving zones from (x, y) coordinates.

Some of the principles of “test-driven development”

Thom also gave advice for people developing models of this kind: always have tests to make sure the model is working as expected (we know that, in theory, this should be done in any software development project, but in reality it rarely happens). I also found it interesting that he pointed out that this type of model usually values attackers too much and defenders too little. This happens because:

  • Strikers are often in situations with a high chance of being rewarded (with goals) and few risks (their mistakes rarely result in opposing goals).
  • Defenders are often in situations with high risk (mistakes can be costly) and little chance of reward (defensive actions don’t always directly lead to goals being scored).

Thus, it is interesting to “normalize” the value of each action according to the options available to the player. This way, you can evaluate which players make the best decisions, because it analyzes the player’s action given what was possible in context.
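One possible way to express this normalization (my own assumption, since the exact formula was not part of my notes) is to rescale the value of the chosen action between the worst and the best option that was available at that moment. The function name and all numbers below are hypothetical.

```python
def decision_score(chosen_value, option_values):
    """Rescale the chosen action's value to [0, 1] relative to the available options.

    1.0 means the player picked the best available option, 0.0 the worst.
    `option_values` must include the value of the chosen action.
    """
    best = max(option_values)
    worst = min(option_values)
    if best == worst:
        return 1.0  # every option was equally good, so any choice is optimal
    return (chosen_value - worst) / (best - worst)


# Made-up values: a striker with valuable options who picks a middling one,
# and a defender with low-value options who picks the best one.
striker = decision_score(0.20, [0.05, 0.20, 0.30])
defender = decision_score(0.02, [-0.10, 0.00, 0.02])
print(f"striker: {striker:.2f}, defender: {defender:.2f}")
```

In raw terms the striker’s action is worth ten times more than the defender’s, but relative to what was possible the defender made the better decision, which is the kind of distortion this normalization tries to correct.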

There was also an interesting discussion about signal vs. noise. Usually, there is little signal and a lot of noise. In the football world, where virtually everyone (fans, coaches, journalists, directors and players) considers themselves an “expert”, it is especially hard to know what really matters.

StatsBomb Live Podcast:

Speakers:

The StatsBomb podcast is one of the best known in the world of football analytics, and it was nice to be able to see and hear it live. The hosts spoke about the current state of analytics, especially in Europe, and how fast we are moving forward. Teams that play in the world’s most competitive leagues are increasingly realizing the value of data analytics and are hiring skilled professionals. Liverpool, who won the Champions League, has 4 PhDs, for instance, and has contributed a lot to spreading the value of data analysis. According to the podcast participants, we are seeing fewer “stupid” transfers, especially in England (I think the same cannot be said for most teams in Brazil yet). After that, the main topic was the English Premier League. They gave brief, humorous but informative reviews of the tournament’s top teams.

(Left to right) Pugsley, Knutson and Yorke on the StatsBomb Live Podcast

Towards the end, when talking about the football analytics industry, the speakers stressed that in order to grow in this space it is essential to show people things they do not know (no matter how simple the ideas are). This is what I try to do here at Algolritmo, with a focus on football played in Brazil, where statistical analysis has not yet arrived at full speed.

Other Conference Talks:

Algolritmo's founder. I have a bachelor's in Computer Science and a master's in Analytics. My goal is to bring a new perspective into Brazilian football. I'm particularly interested in communicating complex ideas through simple data visualizations.

I graduated in Computer Science and Business Administration from the University of Southern California and got a master’s degree in Analytics at the same institution. I have worked as a Data Science intern at companies such as Facebook, Itaú and Looqbox.
