Editor's note: SAS programming concepts in this and other Free Data Friday articles remain useful, but SAS OnDemand for Academics has replaced SAS University Edition as a free e-learning option.
If, like me, you’ve been avidly following the final season of HBO’s Game of Thrones, you’ll undoubtedly have your own favourite character. Mine is Sandor Clegane, AKA ‘The Hound’. The Hound is afraid of only one thing: fire. So for this week’s edition of Free Data Friday I’m going to examine incident data from the London Fire Brigade.
You can get the data from the London Datastore either by downloading it manually or using Proc HTTP.
/* The URL is very long, so let's build it up bit by bit */
%let url=https://data.london.gov.uk/download/london-fire-brigade-incident-records/;
%let url=&url.b8f76a50-c7a0-4ff4-b3e4-7a42c5d0e8e3/LFB%20Incident%20data%20from%20January%202017.xlsx;

/* Local file that will receive the download */
filename out "/folders/myshortcuts/Dropbox/LFB Incident data from January 2017.xlsx";

proc http
    url="&url"
    method="get"
    out=out;
run;
The data is in XLSX format but was easy to import into SAS using Proc Import.
/* Point a fileref at the downloaded workbook and import through it */
filename lfbxlsx '/folders/myshortcuts/Dropbox/LFB Incident data from January 2017.xlsx';

proc import datafile=lfbxlsx
    dbms=xlsx
    out=lfb_all_data;
    getnames=yes;
run;
Unfortunately, I couldn’t find any metadata explaining the fields so a certain amount of educated guesswork (using Google search) was necessary to understand the meaning of certain columns.
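One way to support that guesswork is to list what Proc Import actually created. The following is only a minimal sketch, assuming the import above succeeded and the data set is named lfb_all_data as in the code above. Proc Contents reports each column's name, type and format, which also shows whether dateofcall arrived as a SAS date.

```sas
/* Sketch: list the imported columns in their original order.     */
/* Assumes lfb_all_data was created by the Proc Import step above. */
proc contents data=lfb_all_data varnum;
run;
```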
You would naturally assume that most call-out incidents are related to fires, but I like to check these assumptions. SAS allows me to do that with a single Proc Means call. I used a where= data set option to limit the data to 2018 dates and aggregated at two levels: incident group on its own, and incident group broken down by its components, Stop Code Description.
proc means data=lfb_all_data(where=(dateofcall >= "01Jan2018"d and dateofcall <= "31Dec2018"d)) noprint;
    class incidentgroup stopcodedescription;
    types incidentgroup incidentgroup*stopcodedescription;
    output out=lfbstats(drop=_freq_) n=total;
run;
The output data set looked like this
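If you want to reproduce a listing of that data set yourself, a simple Proc Print is enough. This is a minimal sketch, assuming lfbstats was created by the Proc Means step above. Rows with _type_=2 are summarised by incidentgroup alone (so stopcodedescription is blank), while rows with _type_=3 are summarised by both class variables.

```sas
/* Sketch: list the summary rows written by the Proc Means step.  */
/* _type_=2 -> incidentgroup only; _type_=3 -> both class vars.    */
proc print data=lfbstats noobs;
    var _type_ incidentgroup stopcodedescription total;
run;
```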
Pie charts have a bad reputation, but I believe they can be useful for a quick overview of the data when the number of categories is very small, so I decided to chart the output with the new SGPie procedure. You’ll see from the code how I used the automatically created variable _type_ to limit the data charted to the highest level of aggregation (_type_=2 identifies the rows summarised by incidentgroup alone). I also used some of the procedure’s optional parameters to place the labels outside the pie and to display the values as percentages (even though the source data only holds counts). In addition, I requested a gloss data skin, simply because I prefer that look.
title1 'London Fire Brigade Call-Outs 2018';
title2 'Total Call-Outs by Incident Group';
footnote j=l 'Data from https://data.london.gov.uk';

proc sgpie data=lfbstats(where=(_type_=2));
    pie incidentgroup / response=total
        dataskin=gloss
        datalabeldisplay=(category percent)
        datalabelloc=callout;
run;
This is the resulting chart
This gave some very interesting results. Instead of fire-related call-outs being in the majority, they are actually the smallest of the three types. Just under half of all call-outs turn out to be false alarms. It’s worth mentioning that the Special Service category covers non-fire-related incidents, e.g. flooding, traffic accidents and rescuing cats from trees. To examine this result more closely we can look at the stopcodedescription variable, which we can illustrate with another Proc SGPie call (again using a where= data set option, this time selecting _type_=3 and rows where incidentgroup is “False Alarm”).
title1 'London Fire Brigade Call-Outs 2018';
title2 "Breakdown of False Alarms";
footnote j=l 'Data from https://data.london.gov.uk';

proc sgpie data=lfbstats(where=(incidentgroup="False Alarm" and _type_=3));
    pie stopcodedescription / response=total
        dataskin=gloss
        datalabeldisplay=(category percent)
        datalabelloc=callout;
run;
Here's the output from that call
This gave another interesting result – most false alarms (a fraction over 75%) are classified as “AFA”. I had to resort to a Google search to find out that this stands for “Automatic Fire Alarm.” The apparently very high incidence of automatically triggered false alarms raised the following questions for me:
Could automated alarms be made more accurate without sacrificing safety; and
Is there enough incentive for manufacturers to do this?
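The share of false alarms classified as AFA can also be checked numerically rather than read off the pie chart. The following is a minimal sketch, assuming the lfbstats data set from the Proc Means step is still available. Proc Freq with a weight statement treats each summary row's total as its count and reports each stop code's percentage of all false alarms.

```sas
/* Sketch: percentage share of each stop code within False Alarm. */
/* weight total makes each row count "total" times;               */
/* order=freq lists the most common stop codes first.             */
proc freq data=lfbstats(where=(incidentgroup="False Alarm" and _type_=3)) order=freq;
    tables stopcodedescription;
    weight total;
run;
```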
Did you find something else interesting in this data? Share in the comments. I’m glad to answer any questions.
Visit [[this link]] to see all the Free Data Friday articles.