The coronavirus pandemic has hit businesses around the world hard. For some, such as makers of personal protective equipment for hospitals and providers of video conferencing services, business may well have improved, but for most these are tough times. Many companies may not survive the upcoming recession, but many will come through, just as they have in the past.
In this edition of Free Data Friday, we will be looking at data detailing the oldest company still in existence in almost every country around the world. The data is available from the website https://businessfinancing.co.uk/ and can be downloaded in a number of different formats.
The data is hosted on Google Docs. To download all the tabs at once, you can save it as an XLSX file; if you only want a single tab, you can get it as a comma-separated or tab-separated file. I chose the XLSX option.
I decided to use the XLSX library engine to read the file. This is quite easy - all you need to do is declare a library like so:
libname comps xlsx "/folders/myshortcuts/Dropbox/OldestCompanies.xlsx";
This creates a library in which each tab of the spreadsheet is represented as a data set.
I then ran Proc Contents against the Africa data set to check its structure.
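As a minimal sketch, the library listing and the Proc Contents report could be produced with something like the following. It assumes the comps libref from the libname statement above is still assigned and that the Africa tab surfaces as comps.africa.

```sas
/* List every data set (spreadsheet tab) in the COMPS library.      */
/* NODS suppresses the per-member detail and is only valid with     */
/* the _ALL_ keyword, so just the member list is printed.           */
proc contents data=comps._all_ nods;
run;

/* Show the variables, types and lengths for the Africa tab. */
proc contents data=comps.africa;
run;
```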
I decided to see what percentage of the oldest companies fell into each category. To do this we can use Proc Freq, writing the counts to an output data set. Here is the code for Africa.
proc freq data=comps.africa;
    table category / out=africafreq;
run;
I can show the output as a pie chart using Proc SGPie:
title1 "Africa's Oldest Companies";
title2 'Percentage of Oldest Companies Per Country by Category';
footnote1 j=l 'Data Source: businessfinancing.co.uk ';
proc sgpie data=africafreq;
    pie category / response=count dataskin=gloss
        datalabeldisplay=(category percent) datalabelloc=outside;
run;
Notice how, even though I am using count as the response variable, specifying datalabeldisplay=(category percent) makes SAS display percentage values for the slices of the pie.
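The slice percentages can be cross-checked against the data set that Proc Freq wrote, since an out= data set carries a PERCENT variable alongside COUNT. Here is a minimal sketch, assuming africafreq was created by the Proc Freq step above; africafreq_sorted is just an illustrative name.

```sas
/* Order the categories from most to least common. */
proc sort data=africafreq out=africafreq_sorted;
    by descending count;
run;

/* Print the counts next to the percentages PROC FREQ calculated. */
proc print data=africafreq_sorted noobs;
    var category count percent;
    format percent 5.1;
run;
```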
The surprise for me here was that in 31.5% of African countries the oldest company was in the banking and finance sector. Given the continent's reputation for mining and mineral wealth, I had expected that sector to be dominant.
I then ran the same analysis for Europe:
proc freq data=comps.europe;
    table category / out=europefreq;
run;
title1 "Europe's Oldest Companies";
title2 'Percentage of Oldest Companies Per Country by Category';
footnote1 j=l 'Data Source: businessfinancing.co.uk ';
proc sgpie data=europefreq;
    pie category / response=count dataskin=gloss
        datalabeldisplay=(category percent) datalabelloc=outside;
run;
There's another surprise here, with brewing the largest sector at 21.7%.
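Because the Proc Freq and Proc SGPie steps are identical for each continent apart from the data set and the title, they could be wrapped in a macro. The following is only a sketch: the macro name category_pie, its parameters ds and region, the work.catfreq data set and the reworded title are all made up for illustration.

```sas
/* Hypothetical helper: frequency table plus pie chart for one data set. */
%macro category_pie(ds=, region=);
    /* Count the companies in each category; NOPRINT skips the listing. */
    proc freq data=&ds noprint;
        tables category / out=work.catfreq;
    run;

    title1 "Oldest Companies: &region";
    title2 'Percentage of Oldest Companies Per Country by Category';
    footnote1 j=l 'Data Source: businessfinancing.co.uk';

    proc sgpie data=work.catfreq;
        pie category / response=count dataskin=gloss
            datalabeldisplay=(category percent) datalabelloc=outside;
    run;

    /* Clear titles and footnotes so they do not leak into later output. */
    title;
    footnote;
%mend category_pie;

%category_pie(ds=comps.africa, region=Africa)
%category_pie(ds=comps.europe, region=Europe)
```

Each call overwrites work.catfreq, so keep the separate out= data sets from the original code if you want to hold on to the counts for every continent.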
It seems that there are significant geographical variations in the type of company that survives to become the oldest in any given country, so finally I wanted an overall view to compare them with. As each continent was represented as a different data set, I could have appended them all together to create another, larger file. That, however, would have been wasteful in terms of disk space, so I created an SQL view. A view is a virtual table based on the result of an SQL statement: it stores the query rather than a copy of the data. In this case the aggregate table would have been small, so the saving is trivial, but with large tables a view can result in big savings. Here is the code to create the view.
proc sql;
    create view all_countries
    as select *
    from comps.africa
    union
    select *
    from comps.asia
    union
    select *
    from comps.europe
    union
    select *
    from comps.'north america'n
    union
    select *
    from comps.oceania
    union
    select *
    from comps.'south america'n
    ;
quit;
One thing to note here is that where a data set name contains spaces it must be written as a name literal: enclosed in quotes and followed by the letter n.
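One caveat worth keeping in mind is that a plain union matches columns by position and drops duplicate rows. That is harmless if every tab has the same columns in the same order and no two rows are identical. If you would rather not depend on that, a sketch like the following uses outer union corr, which lines columns up by name and keeps every row. The view name all_countries_corr is made up for illustration, and the sketch assumes the column names match across tabs.

```sas
proc sql;
    /* OUTER UNION CORR concatenates the tables, aligning columns by name */
    /* and keeping every row, much like appending the data sets.          */
    create view all_countries_corr as
        select * from comps.africa
        outer union corr
        select * from comps.asia
        outer union corr
        select * from comps.europe
        outer union corr
        select * from comps.'north america'n
        outer union corr
        select * from comps.oceania
        outer union corr
        select * from comps.'south america'n
    ;

    /* Write the stored query to the log to confirm what the view contains. */
    describe view all_countries_corr;
quit;
```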
I can now treat the view as if it were a normal SAS data set.
proc freq data=all_countries;
    table category / out=allfreq;
run;
title1 "The World's Oldest Companies";
title2 'Percentage of Oldest Companies Per Country by Category';
footnote1 j=l 'Data Source: businessfinancing.co.uk ';
proc sgpie data=allfreq;
    pie category / response=count dataskin=gloss
        datalabeldisplay=(category percent) datalabelloc=outside;
run;
So overall, as in Africa, banking and finance is the clear winner in providing the largest number of countries with their oldest company.
Did you find something else interesting in this data? Share in the comments. I’m glad to answer any questions.
Visit [[this link]] to see all the Free Data Friday articles.