SAS programming concepts in this and other Free Data Friday articles remain useful, but SAS OnDemand for Academics has replaced SAS University Edition as a free e-learning option. Hit the orange button below to start your journey with SAS OnDemand for Academics:
Later this month Theresa May will travel to Buckingham Palace and formally resign her post as Prime Minister of the United Kingdom. At the same time she will resign her other, lesser-known job as First Lord of the Treasury and it’s in this role, not as is often popularly believed, in her role as Prime Minister that she uses number 10 Downing Street as her official residence.
As soon as she has resigned, she will leave “Number 10” to be replaced by her successor. Normally when someone moves out of their residence one of the last things they have to do is read their electricity meter and inform the power company that they are leaving. Of course, in the case of Theresa May this will not be the case as “Number 10” is essentially a place of work, but we are still able to read the meter for her as the UK government makes its electricity usage at Number 10 available online through the web site Carbonculture.net
In this edition of Free Data Friday, we will be looking at a year's worth of electricity usage at 10 Downing Street to see if we can find anything interesting or unusual in the data.
The data is not available in real time but is released periodically. The last year for which we have a full set of data is 2017 and so I downloaded that as a CSV file from the link on the web site.
I opened the CSV file in Excel and deleted some unnecessary header rows, then saved the result as an XLSX file so I could use the SAS XLSX engine to read it. This is the code I used followed by an image of the file.
options validvarname=any;
libname elecuse xlsx "/folders/myshortcuts/Dropbox/number10_2017.xlsx";
It is in what I always refer to as “spreadsheet format”, that is short and wide – the time variables are much easier to work with if they are transposed into a single variable making the file long and thin. This is done using the following Proc Transpose statement:
proc transpose data=elecuse.data out=notentrans;
by a;
run;
This gave me a file like this:
In order to analyse the data I ran the following data step to create day and month variables using the datepart function. I also corrected an issue with the transposed time variable for midnight – this was being converted into a datetime value in 1899….
data notentrans_cleaned(drop=a _name_ _label_ rename=(col1=usage));
format date ddmmyy10. time time20.;
set notentrans;
if substr(_name_,1,4)="1899" then time='00:00:00'T;
else time=input(_label_,time32.);
date=datepart(a);
Month=month(date);
Day=day(date);
run;
Here is the output from the data step:
Finally I ran a Proc Summary to aggregate the data by month and day:
proc summary data=notentrans_cleaned;
class month day;
var usage;
output out=stats sum=;
run;
Here's the output from the Proc Summary
In order to look for trends or anomalies in the data I decided to create a heat map using the HEATMAPPARM statement in Proc SGplot:
ods graphics / reset=all imagemap;
title1 "Electricity Usage in 10 Downing Street - 2017";
footnote j=l 'Data From: https://platform.carbonculture.net/places/10-downing-street/9/';
proc sgplot data=stats(where=(_type_=3));
heatmapparm x=month y=day colorresponse=usage /
discretex discretey colormodel=threecolorramp
tip=(day month usage)
tiplabel=("Day: " "Month: " "Usage: ");
gradlegend / title='Usage in kWh';
run;
We can see that higher usage is generally associated with the winter months of December, January and February. This is to be expected given the colder weather and fewer hours of daylight. There is, however, an interesting spike in electricity use from 19 to 22 June inclusive (the tool tips help a lot in identifying the specific dates). The question is, what happened on those days to cause this sudden increase?
We can’t be completely certain, but two events occurred during this period:
Both of these events would, in all likelihood, have resulted in a big increase in activity in Number 10 and this gives rise to an interesting proposition - if we had real time data from the electricity meters could we detect periods of intense political activity before they became public knowledge? We’d need more data to be certain but it’s an interesting avenue for future exploration.
Did you find something else interesting in this data? Share in the comments. I’m glad to answer any questions.
Visit [[this link]] to see all the Free Data Friday articles.
Need to correlate the values against days that she was in session. Is calendar data easily available for that information?
Hi Reeza and thanks for the comment. If you mean the parliamentary session there is data available but not in an easy format to analyze. The UK Parliament web site has a sessional diary available but it's a PDF. Moreover sessions are odd things - they normally last a year but can be longer or shorter (the current session has run for over two years). Also the House doesn't sit for anywhere near the full length of a session (MPs like their holidays) and traditionally nothing much happens on a Friday...
An interesting analysis. It will probably be useful after some time to compare these data with those obtained after the work of Boris Johnson. Planning to do such an analysis?
Thanks @frobi - I suspect when we get the data for 2019 it will be very interesting. With the multiple Brexit deadlines, a new Prime Minister and whatever else the rest of the year may hold there will be plenty of opportunity for analysis.
How does that data compare with previous PMs? Or members of the House of Commons? Or to regular citizens in the area? Seems like the population sample size of 1 is rather small.
Thanks @tomrvincent - several good points there. Unfortunately although the web site has data for earlier years it isn't always very reliable. For example the data during the David Cameron premiership contains a lot of missing values and repetition of the meter reading values which make this sort of analysis not really viable. Also although 10 Downing Street looks like a normal house when shown in news photos or TV it is actually huge. According to Wikipedia it was originally 3 houses and contains about 100 rooms and is more an office building than a residence (quite a few recent Prime Ministers have chosen not to live there but to just use it as a place of work).
You're right a sample of 1 is small 🙂 but it's a pretty unique building anyway.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning and boost your career prospects.