This is the second text in my series of analyses of the UEFA Champions League semi-final matchups. I recently published a text about Juventus vs. Monaco (https://algolritmo.com/index.php/2017/04/26/analysis-of-juventus-x-monaco/). Today I use the same methodology to write about the upcoming Madrid derby: Real x Atletico.
Assumptions:
In previous texts, I have shown that data on the quality of shots (also known as expected goals, abbreviated to exG) reveals a lot about a team’s performance. Today, I’ll add the shot “location” as another important factor, so you will be able to see not only the quality of each shot but also the region of the pitch where it was taken.
Another important assumption: as the football season in Europe is nearing its end, I believe there is already enough data to analyze the offensive and defensive performance of a team.
Methodology:
First, I took the data provided by Stratabet and made a subset with only the shots made and conceded by Real Madrid and Atletico Madrid during the current season. I then applied a statistical technique called clustering to that data (if you want to know more, read the notes at the end of the text) and calculated the average shots made and conceded by each team. The concept is very simple: shots are grouped by similarity, and each group forms a cluster.
After that, I used Plotly, a tool that allows you to create interactive graphics, and prepared 2 graphs. The idea is to be able to see and analyze the shots (and average shots) for and against Real Madrid and Atletico Madrid.
For each chart, I inserted the offensive data of one team and the defensive data of the other. That is, a graph with the title “Real Madrid Attacking vs Atletico Madrid Defending” includes all the shots made by Real Madrid and all the shots conceded by Atletico Madrid.
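The subsetting and pairing described above can be sketched in a few lines of Python. This is only an illustration: the `Shot` record, its field names, and the sample rows are assumptions made up for the example, not the actual StrataData format or the code used for the charts.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    team: str      # team that took the shot (hypothetical field name)
    opponent: str  # team that conceded the shot (hypothetical field name)
    x: float       # pitch coordinates, arbitrary units
    y: float
    exg: float     # shot quality, between 0 and 1


def shots_made(shots, team):
    """Shots taken by `team`, against any opponent."""
    return [s for s in shots if s.team == team]


def shots_conceded(shots, team):
    """Shots taken by any opponent against `team`."""
    return [s for s in shots if s.opponent == team]


def chart_data(shots, attacking, defending):
    """Pair one team's attack with the other team's defense, one series each."""
    return {
        f"{attacking} attacking": shots_made(shots, attacking),
        f"{defending} defending": shots_conceded(shots, defending),
    }


def average_exg(shots):
    """Mean exG of a list of shots (0.0 for an empty list)."""
    return sum(s.exg for s in shots) / len(shots) if shots else 0.0


# Made-up rows, for illustration only
sample = [
    Shot("Real Madrid", "Sevilla", 50.0, 10.0, 0.30),
    Shot("Real Madrid", "Atletico Madrid", 45.0, 8.0, 0.10),
    Shot("Barcelona", "Atletico Madrid", 55.0, 20.0, 0.05),
    Shot("Atletico Madrid", "Real Madrid", 48.0, 12.0, 0.20),
]
data = chart_data(sample, "Real Madrid", "Atletico Madrid")
```

Note that a shot taken by Real Madrid against Atletico appears in both series of this chart, since it is at once a shot made by Real and a shot conceded by Atletico.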
The size of each point in the graph indicates the quality of the chance created or conceded by the team. The bigger the point, the more likely that chance is to result in a goal. In addition, by hovering the mouse over a point you can see who the team was playing against, which player took the shot, the play type and the outcome of that chance (if it was defended, saved or ended in a goal). You can choose whether you want to visualize all the chances, the average chances (clusters) or only those that ended in goals.
Besides that, at the end of the text I also prepared a special comparison between both teams’ main stars, Cristiano Ronaldo and Antoine Griezmann.
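As a rough illustration of how a chart like this can encode quality as size, the Python sketch below maps exG to a marker diameter, builds the hover label, and filters the view. The size range, the dictionary keys and the sample chance are all assumptions for the example; the values it produces are the kind of thing one would hand to a scatter plot's marker size and hover text.

```python
def marker_size(exg, min_size=4.0, max_size=30.0):
    """Map an exG value in [0, 1] linearly to a marker diameter."""
    exg = min(max(exg, 0.0), 1.0)  # clamp out-of-range values
    return min_size + (max_size - min_size) * exg


def hover_text(chance):
    """Label shown when hovering over a point (<br> is a line break in Plotly)."""
    return (
        f"{chance['player']} vs {chance['opponent']}<br>"
        f"{chance['play_type']}, {chance['outcome']}, exG {chance['exg']:.2f}"
    )


def select_view(chances, view):
    """Return the chances for the chosen view: 'all' or 'goals'."""
    if view == "all":
        return list(chances)
    if view == "goals":
        return [c for c in chances if c["outcome"] == "goal"]
    raise ValueError(f"unknown view: {view}")


# Made-up chances, for illustration only (hypothetical keys and values)
chances = [
    {"player": "Player A", "opponent": "Team X", "play_type": "open play",
     "outcome": "goal", "exg": 0.35},
    {"player": "Player B", "opponent": "Team Y", "play_type": "corner",
     "outcome": "saved", "exg": 0.08},
]
sizes = [marker_size(c["exg"]) for c in chances]
labels = [hover_text(c) for c in chances]
goals_only = select_view(chances, "goals")
```

A linear mapping keeps the comparison honest: a chance with twice the exG does not look four times as large, which is what scaling the marker area instead of the diameter would have to be tuned for.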
Results:
In the chart below we can see the performance of Real’s attack and Atletico’s defense simultaneously (in other words, the chances created by Real and the chances conceded by Atletico throughout this season):
I suggest you first look at Atletico de Madrid’s data. Since Diego Simeone was appointed as manager, Atletico has become a major reference for defensive play, and the chart reinforces that thesis. Note that the team does concede a considerable number of shots during its matches, but the dots are considerably small. In other words, Atletico “allows” opponents to shoot, but only when the probability of scoring a goal is low.
Now display only Real Madrid’s data on the chart and see a picture of what it means to be one of the world’s best attacks. The center of the larger box is covered by a gray “stain” with many relatively large dots, which indicates a huge headache for any opponent facing the merengue team, as it creates a lot of high-quality chances. Hovering the mouse over some of the dots, one predominant (and even predictable) name becomes clear: Cristiano Ronaldo. Even with Benzema, Bale (before getting injured), Morata and defender Sergio Ramos contributing to the attacking performance, the Portuguese player is still the team’s main weapon. Click on the “Clusters” option and Real Madrid’s offensive power will become even more evident, as there are two large gray dots near the small box that indicate the quantity and quality of the team’s shots in that region of the pitch.
Next, we’ll look simultaneously at Atletico’s attack and Real’s defense (i.e. the chances created by Atletico and the chances conceded by Real in the current season):
Again, leave only Atletico’s data on the chart. It is true that in recent years the team has become famous for its defensive solidity, but its offensive performance in the current season is very interesting. Atletico is not the team that shoots the most, but it manages to create many opportunities, especially through Griezmann, who is currently the main name of the squad. In addition, less hyped (but very efficient) players like Carrasco, Gameiro, Correa and Saúl also contribute a lot to the offensive production. The team seems to know how to “choose” when to shoot, which gives many of its shots a high exG.
Continuing the analysis, now leave only Real Madrid’s data on the graph. The merengue team’s defense is in general very solid, but there are some groups of dots within the box that should worry the manager, Zinedine Zidane. The team suffers defensively in some games and concedes goals that could prove costly, yet its attack usually makes up for these defensive issues. One of the main reasons for Real’s defensive problems is certainly injuries. Even with Sergio Ramos (one of the best defenders nowadays) as a regular starter, Real Madrid’s starting centre-back pairing has changed a few times during the season due to the injuries of Varane and Pepe. Nowadays Nacho features in the starting team together with Ramos.
The two graphs presented above indicate that we will have another extremely fierce and open Madrid derby (something that has become routine in recent years), since the merengue attack and the colchonero defense are both extremely efficient. The two sides have met in 2 of the last 3 Champions League finals, thrilling games that Real ended up winning. In my opinion Real Madrid is the slight favorite, but I would not be surprised to see Atletico advance to the final.
Extra Comparison:
I believe that this text would be incomplete without a comparison between Cristiano Ronaldo and Griezmann, two great attackers who were both nominated for the Ballon d’Or last year. The following comparison considers only the shots taken by each of the two players:
The image leaves no doubt that Cristiano Ronaldo, although questioned by some, is having a great season. Of course, at the age of 32 the Portuguese player has already lost some of his famous explosiveness, but he remains lethal because he has been adapting his positioning and playing closer and closer to the box. Personally, I would like to see Cristiano Ronaldo “officially” as the team’s center-forward; for this I would remove Benzema from the team and play Isco wide on the left. Looking now at Griezmann’s data, do not let the comparison with Cristiano Ronaldo make the Frenchman’s performance look bad. Griezmann plays a different role, generally acting more as a second striker who helps to create the team’s plays, which is one of the reasons he shoots less than Cristiano Ronaldo. Despite this, Griezmann has an excellent goal average and exG per game and needs fewer shots to score a goal than the Portuguese star (Griezmann needs only 4.38 shots to score a goal, while Cristiano Ronaldo needs 7).
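To put the shots-per-goal figures above on a more familiar scale, they can be inverted into a conversion rate (goals per shot). The short Python sketch below uses only the two numbers quoted in the text; the function name is my own.

```python
def conversion_rate(shots_per_goal):
    """Goals per shot, i.e. the inverse of shots needed per goal."""
    return 1.0 / shots_per_goal


griezmann = conversion_rate(4.38)  # roughly 0.23 goals per shot
ronaldo = conversion_rate(7.0)     # roughly 0.14 goals per shot
ratio = griezmann / ronaldo        # roughly 1.6
```

In other words, Griezmann converts about 23% of his shots against about 14% for Cristiano Ronaldo, or roughly 1.6 times as often per shot.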
It is almost certain that Cristiano Ronaldo will be one of the main candidates to win the Ballon d’Or this year, but if Griezmann can help Atletico reach the Champions League final, and maybe even win it, he will also become a very strong candidate. It sure is an extra ingredient for this week’s showdown.
Observations:
- The expected goals (exG) concept is very simple. The greater the exG of a shot, the better its quality, which means a higher probability of scoring a goal. I explained the metric in more depth in this text: https://algolritmo.com/index.php/2017/03/18/tite-vs-dunga-2/
- Stratabet provides data on shots and dangerous moments (attacks with a high chance of scoring that ended without a shot). I used both types of data in the charts.
- For the clustering analysis, I used the “kmeans” function of R, considering the location and the quality of each chance. There are several ways to determine the number of clusters to be extracted. I chose 6 after an empirical analysis of the dendrogram of each team’s shots.
- I highly recommend the plotly library for R and Python. The tool is excellent.
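For readers who want to see what the clustering step does, here is a minimal k-means sketch in plain Python. It is not R's “kmeans” function and not the code behind the charts: it is the textbook Lloyd-style loop, and the feature layout `(x, y, exG)`, the random initialization and the sample points are assumptions for illustration. With real data, the analysis above would correspond to calling it with `k=6`.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def kmeans(points, k, iterations=50, seed=0):
    """Group points into k clusters; returns (centers, labels)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centers[c]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:  # converged, nothing moved
            break
        centers = new_centers
    return centers, labels


# Made-up chances as (x, y, exG): two obvious groups, for illustration only
points = [
    (0.0, 0.0, 0.1), (1.0, 0.0, 0.1), (0.0, 1.0, 0.1),
    (100.0, 100.0, 0.5), (101.0, 100.0, 0.5), (100.0, 101.0, 0.5),
]
centers, labels = kmeans(points, k=2)
```

Each returned center is the “average shot” of its cluster: an average location plus an average exG. One design caveat: exG lives between 0 and 1 while pitch coordinates span a much larger range, so without rescaling the features the location dominates the distance. Whether and how to rescale is a choice the notes above do not specify.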
This article was written with the aid of StrataData, which is property of Stratagem Technologies. StrataData powers the StrataBet Sports Trading Platform, in addition to StrataBet Premium Recommendations.
Algolritmo's founder. I have a bachelor's in Computer Science and a master's in Analytics. My goal is to bring a new perspective into Brazilian football. I'm particularly interested in communicating complex ideas through simple data visualizations.
I graduated in Computer Science and Business Administration at the University of Southern California and got a master's degree in Analytics at the same institution. I have worked as a Data Science intern at companies such as Facebook, Itaú and Looqbox.
Congratulations.
Your text is really cool, especially the analyses with data and illustrative charts.
Keep walking !!
Thank you, Gustavo!