SAS programming concepts in this and other Free Data Friday articles remain useful, but SAS OnDemand for Academics has replaced SAS University Edition as a free e-learning option. Hit the orange button below to start your journey with SAS OnDemand for Academics:
Polls consistently show that one of the things people most enjoy about Christmas is spending time with their families. It’s a time when families gather, exchange presents, eat lots of good food and observe time-honoured family traditions. However, for some people this isn’t possible because they have to, or choose to, work on Christmas Day. In this edition of Free Data Friday we will be looking at an occasional series published by the UK Office for National Statistics (ONS) which tells us what percentage of workers turn up for work on Christmas Day.
The data is available for download as an XLS file from the ONS web site. The latest period for which the data is available is Christmas 2016.
Unfortunately, the ONS has published a number of different data tables on the same worksheet which makes importing the data into SAS a little more complicated. In this scenario the easiest way to proceed is to use the RANGE statement in Proc Import. You can either use an explicit cell reference or an Excel range name. It’s at this point that I have to make a confession – I am a former spreadsheet programmer. I have always said that the key to good spreadsheet programming is using range names. For example, in our case if the table we want is positioned in different cells in different years we can still use the same program without amendment. This wouldn’t be possible if we had used cell references. With this in mind I opened the spreadsheet, created range names for each of the tables and saved the file in the more modern XLSX format. You can find more information about using range names with Proc Import in an excellent blog post from Chris Hemedinger.
I decided to use two of the tables on the sheet so issued two Proc Import statements as follows
proc import datafile="/folders/myshortcuts/Dropbox/worker_christmas.xlsx" out=region_import replace dbms=xlsx; range="region"; run; proc import datafile="/folders/myshortcuts/Dropbox/worker_christmas.xlsx" out=occupation_import replace dbms=xlsx; range="occupation"; run;
This gave me two data sets looking like this
The next steps were to sort the regional file, renaming and dropping some variables and tidying up the occupation file, removing the codes at the start of the occupation name, removing quotation marks and similarly renaming and dropping variables.
proc sort data=region_import(where=(k ne 'Total')) out=region(rename=(k=region as_percentage_of_region=perc) drop=level); by descending as_percentage_of_region; run; data occupation(keep=occupation perc); length occupation $40; set occupation_import(rename=(as_percentage_of_occupation=perc)); occupation=dequote(strip(substr(f,6))); run;
The data is already aggregated so no further manipulation is required - instead I first printed the region table making use of Proc Print like so
title1 'Who Works Over Christmas?'; title2 'UK Regional Percentages of People Working on Christmas Day'; footnote1 j=c 'Data from the UK Office for National Statistics'; footnote2 j=c 'https://www.ons.gov.uk'; proc print data=region noobs split="*"; label region='Region*'; label perc='Percentage Working*'; run;
This gave me the following listing
One interesting point from this is the position of Scotland at the head of the table. More than twice as many workers in percentage terms work in Scotland on Christmas Day than in London and Northern Ireland. It is true that in Scotland New Year is considered a more important holiday than it is in other parts of the country and Christmas Day a little less so but to my mind this doesn't wholly explain the disparity between Scotland and all the other regions.
For the occupation data set I decided to create a vertical bar chart. Here is the code I used followed by the output
ods graphics / imagemap; title1 'Who Works Over Christmas?'; title2 'Top 15 Occupations Working on Christmas Day in the UK 2016'; footnote1 j=l 'Data from the UK Office for National Statistics'; footnote2 j=l 'https://www.ons.gov.uk'; proc sgplot data=occupation; styleattrs backcolor=biyg wallcolor=biyg; vbar occupation / response=perc categoryorder=respdesc fillattrs=(color=red) dataskin=sheen datalabel=perc datalabelpos=data tip=(occupation perc) tiplabel=("Occupation:" "Percentge Working:"); xaxis label="Occupation"; yaxis label="Percentage Working"; run;
My initial thought was that this chart would be dominated by the emergency services - fire, police and medical staff but when I saw clergy at the top of the list it made perfect sense. In fact 51.4% seems a little low even though this would be almost exclusively Christian clergy. The other surprise to me was the 8.5% of sales supervisors working when virtually all shops are closed on Christmas Day.
There is, of course, one occupation missing from this chart - it has only one person working in it and is a curious amalgam of delivery services, animal wrangling and toy manufacture. It is, as you might have guessed, the one man who has worked every Christmas Eve and Christmas Day since time immemorial - Santa Claus.
Did you find something else interesting in this data? Share in the comments. I’m glad to answer any questions.
Visit [[this link]] to see all the Free Data Friday articles.
Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.
If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website.
Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning and boost your career prospects.