
Europe's top footballers investigated with SAS

Started ‎01-28-2022 by
Modified ‎01-28-2022 by
Football (soccer to my American friends) is very much a team game. No matter how good a star player is, they need a good team around them or they won’t shine. Nevertheless, star players are celebrated and receive awards for their achievements. One of the most prestigious of these awards is the Ballon d’Or (French for ‘Golden Ball’), awarded annually since 1956 to the outstanding male player of the year by the French football magazine France Football. Originally limited to European-born players, it was subsequently opened up to any player competing in European professional football and finally to any professional anywhere. The trophy is awarded after a vote by football journalists, international coaches and national team captains, and is highly prized.

 

In this edition of Free Data Friday, we will be looking at data covering the winners of the Ballon d’Or from 1956 to 2018 to see what we can learn from it about European football.

 

Get the data

 

The data can be downloaded as a CSV file from the Data.World web site (free registration is available).

 


Getting the data ready

 

I used Proc Import to bring the data into a SAS data set. There were no issues with the imported file.

 

filename reffile '/home/chris52brooks/BallonDor/ballondor.csv';

proc import datafile=reffile
	dbms=csv
	replace
	out=ballon;
	guessingrows=700;
	getnames=yes;
run;

 

This is what the file looks like:

 

BDDS1.png
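Before relying on the import, it can be worth confirming how the columns were typed. The sketch below is an optional check rather than part of the original workflow; it assumes only the data set name ballon created above. In particular, the later filters use rank=1, which only works if rank was imported as a numeric variable.

```sas
/* List the variables PROC IMPORT created, in file order, with type and length */
proc contents data=ballon varnum;
run;

/* Print the first 10 rows as a quick visual sanity check */
proc print data=ballon(obs=10);
run;
```

If rank turns out to be a character variable, the comparisons in the later steps would need to be written as rank='1' instead.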

 

The results

 

Firstly, I used some simple Proc SQL to create files containing the total number of wins for each player, for each club, and for each player nationality. Each query keeps only the winners (rank=1) and counts the rows in each group.

 

proc sql;
	create table playerrecord as
	select distinct player,
	count(rank) as wins
	from ballon
	where rank=1
	group by player
	order by wins desc;
quit;

proc sql;
	create table clubrecord as
	select distinct club,
	count(club) as wins
	from ballon
	where rank=1
	group by club
	order by wins desc;
quit;

proc sql;
	create table countryrecord as
	select distinct nationality,
	count(nationality) as wins
	from ballon
	where rank=1
	group by nationality
	order by wins desc;
quit;
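
As a cross-check on the three tables above, the same counts can be produced in a single step with Proc Freq. This is an alternative sketch, not the approach used in this article; it assumes the same variable names (player, club, nationality, rank) and only prints the results rather than creating data sets.

```sas
/* One-way frequency tables of winners, most frequent value first */
proc freq data=ballon order=freq;
	where rank=1;                          /* keep only the winner for each year */
	tables player club nationality / nocum nopercent;
run;
```

The frequencies should match the wins column in playerrecord, clubrecord and countryrecord.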

I then displayed the results using Proc SGPlot.

 

ods graphics / reset;
proc sgplot data=playerrecord(obs=5);
	title1 "Ballon d'Or Winners (1956-2018)";
	title2 "Top 5 Winners";
	footnote j=r "Data From: https://data.world";
	hbar player / response=wins
		datalabel datalabelattrs=(weight=bold) categoryorder=respdesc;
	xaxis grid label="Number of Wins";
	yaxis grid  label="Player";
run;

ods graphics / reset;
proc sgplot data=clubrecord;
	title1 "Ballon d'Or Winners (1956-2018)";
	title2 "Top Clubs";
	footnote j=r "Data From: https://data.world";
	hbar club / response=wins
		datalabel datalabelattrs=(weight=bold) categoryorder=respdesc;
	xaxis grid label="Number of Wins";
	yaxis grid  label="Club";
run;

ods graphics / reset;
proc sgplot data=countryrecord;
	title1 "Ballon d'Or Winners (1956-2018)";
	title2 "Top Nationalities";
	footnote j=r "Data From: https://data.world";
	hbar nationality / response=wins
		datalabel datalabelattrs=(weight=bold) categoryorder=respdesc;
	xaxis grid label="Number of Wins";
	yaxis grid  label="Nationality";
run;

This is what the results looked like:

 

BDChart1.png

 

BDChart2.png

 

BDChart3.png

 

We can see that, up to 2018, two players (Cristiano Ronaldo and Lionel Messi) are tied for the most wins. For the record, Messi also won in 2019 and 2021, putting him ahead; no trophy was awarded in 2020 due to COVID-19. There is also a tie for top club between the two Spanish giants, Barcelona and Real Madrid. However, if we look at the winners' nationalities we can see something interesting. Despite Spanish clubs winning 22 trophies between them, only 3 wins went to players of Spanish nationality. I decided to find out who exactly had won for them.

 

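proc sql;
	/* One row per winning player at the two Spanish clubs. */
	/* Grouping by every selected column avoids remerging   */
	/* summary statistics back onto the detail rows.        */
	create table spanishrecord as
	select player,
	club,
	nationality,
	count(*) as wins
	from ballon
	where rank=1 and club in ("FC Barcelona", "Real Madrid CF")
	group by player, club, nationality
	order by club desc;
quit;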

ods graphics / reset;
proc sgplot data=spanishrecord;
	title1 "Ballon d'Or Winners (1956-2018)";
	title2 "Winning Player Nationality for Spanish Clubs";
	footnote j=r "Data From: https://data.world";
	hbar nationality / response=wins
		datalabel datalabelattrs=(weight=bold) categoryorder=respdesc;
	xaxis grid label="Number of Wins";
	yaxis grid  label="Nationality";
run;

BDDS5.png

 

BDChart4.png

 

Only two players of Spanish nationality (Alfredo Di Stefano with 2 wins and Luis Suarez Miramontes with 1 win) contributed to the Spanish club totals. Moreover, Di Stefano was born and brought up in Argentina and originally played for the Argentine national team.

 

This, coupled with the fact that Dutch players have won 7 times but only a single win (for Ajax) went to a Dutch club, shows us how cosmopolitan European football is. Players move from country to country a lot, the top players especially so, making the clubs a rich mixture of nationalities and cultures.
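
Claims like the Dutch one can be checked by listing every winner alongside the club they won with, grouped by nationality. The query below is a minimal sketch, assuming the same ballon data set and variable names used above. It deliberately avoids filtering on a specific nationality value, since the exact spelling used in the file is not shown here.

```sas
title;      /* clear titles and footnotes left over from the charts */
footnote;

proc sql;
	/* Wins per player and club, listed under each nationality */
	select nationality,
	player,
	club,
	count(*) as wins
	from ballon
	where rank=1
	group by nationality, player, club
	order by nationality, wins desc;
quit;
```

Scanning the rows for one nationality shows at a glance which clubs, and therefore which leagues, its winners were playing in.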

 

Now it's your turn!

 

Did you find something else interesting in this data? Share in the comments. I’m glad to answer any questions.

 
