BookmarkSubscribeRSS Feed

Mapping London's Knife Crime Epidemic with SAS

Started ‎11-08-2019 by
Modified ‎08-03-2021 by
Views 4,431

Editor's note: SAS programming concepts in this and other Free Data Friday articles remain useful, but SAS OnDemand for Academics has replaced SAS University Edition as a free e-learning option. Hit the orange button below to start your journey with SAS OnDemand for Academics:

 

Access Now

action-asphalt-automobile-automotive-303313.jpg

 

While it is true to say that the United Kingdom has very little gun crime, it is also true to say that human ingenuity in finding ways to harm each other seems to know no bounds. There has been, over the last few years, increasing public concern over the level of  knife crime in the country, particularly in London, where it is often said to have reached epidemic proportions.

 

In this edition of Free Data Friday, we will be looking at knife crime data from the House of Commons Library to map incidents of knife crime across the London boroughs to try to establish if there is any geographical pattern to this epidemic.

 

Get the Data

 

You can download the data as an Excel file from the House of Commons Library along with an explanatory PDF report. We will also need a shapefile for London showing borough boundaries. This can be downloaded from the London Datastore – note that the page states that while we are free to use the shapefiles to create maps, we must include two copyright notices on the output.

 

 

Get started with SAS OnDemand for Academics

In this 9-minute tutorial, SAS instructor @DomWeatherspoon shows you how to get your data into SAS OnDemand for Academics and other key steps:

Get Started

 

Getting the Data Ready

 

We will be using Proc SGMap which can’t directly use shapefiles so firstly we must use Proc Mapimport to convert the shapefile into a SAS map data set.

 

 

%let mapfile=/folders/myshortcuts/Dropbox/statistical-gis-boundaries-london/ESRI/;
%let mapfile=&mapfile.London_Borough_Excluding_MHW.shp;

proc mapimport out=london datafile="&mapfile";
run;

 

Having done that, we can open the downloaded data file using the XLSX engine and read the required table into a SAS data set. As is usual in a spreadsheet there are header rows and a grand total we don’t need. In addition we can ignore the value for Heathrow airport. It is part of the London borough of Hillingdon, but the number of incidents is tiny. So we can just read in the data from row 6 to row 38. We will also rename some of the variables and allocate the data to category bands (each value will be unique which creates an issue with choropleth maps which we will see later. Creating bands is one way around the problem).

 

 

libname knife xlsx "/folders/myshortcuts/Dropbox/CBPSN04304.download.xlsx";

options validvarname=any;
data borough(keep=name total category);
	length name $50 category $7;
	set knife."T.A4C"n;	
	
	if _n_ > 5 and _n_ < 39 then do;		
		name=A;
		total=input(F,8.);
		if total < 250 then category="0-250";
		else if total < 500 then category="250-500";
		else if total < 750 then category="500-750";
		else if total > 750 then category="750+";
		output;
	end;
run;

 

One further point to note about the data is that it does not include figures for the City of London. This is the main financial district and although comprising only a little over one square mile, it has its own form of government and police force. It is a city in its own right and not a London borough, which explains its absence from the data.

 

Designing the Map

 

Creating any kind of visualization is always an iterative process. I tend to start with the simplest possible design and build on that while trying not to allow the chart or map to become too complicated. Firstly, we’ll see what a bare choropleth of what the imported shapefile looks like. I do this because I find that some imported shapefiles just don't look "right" for my purposes.

 

 

proc sgmap mapdata=london;
	choromap / mapid=name;
run;

 

Map1.png

 

This looks fine so we shall flesh it out with some data:

 

 

proc sgmap mapdata=london maprespdata=borough;
	choromap category/ mapid=name;	
run;

 

Map2.png

 

There are a few problems with this map:

 

  1. The boroughs are not named so unless you are familiar with London the map is of limited use; and
  2. The colours are not intuitive - you must consult the legend before having any idea which areas have higher figures than others; and
  3. The map has no title, so we don't know what data is displayed or what the source is; and
  4. The copyright notices are missing; and
  5. The legend starts with higher figures at the left - I would expect the lower figures to be first.

 

We can start the process of improving the map by adding the borough names. We can do this by using the centroid macro supplied with SAS University Edition to determine the central point of each borough and then using the TEXT statement in PROC SGMap to display the borough names at those points:

 

 

proc sort data=london out=london_sort;
	by name;
run;

%centroid(london_sort, centres,name);

proc sgmap mapdata=london maprespdata=borough plotdata=centres;
	choromap category/ mapid=name;	
	text x=x y=y text=name;
run;

 

Map3.png

 

This wasn't a resounding success - the names were placed in the correct position but they are too long for the borough areas so there is a lot of overlap. Instead of using names, I decided to use numbers for the boroughs

 

 

data centres;
	set centres;
	num=_n_;
run;

proc sgmap mapdata=london maprespdata=borough plotdata=centres;
	choromap category/ mapid=name name="choro";	
	text x=x y=y text=num;
	keylegend "choro" / title="Total Incidents";
run;

 

Map4.png

 

This is much better, but we still have some issues to address.

 

  1. Adding the title, data source and copyright notices is straightforward; and
  2. We can make the colors more intuitive by using a gradlegend instead of a keylegend - this gives us a gradient that is not only a more accurate depiction of the relative values, but allows us to tell at a glance which boroughs have the higher values; and
  3. A simple PROC Print provides a key to the borough numbers added in the previous step:

 

proc sort data=borough out=borough_sorted;
	by category;
run;

title1 "London Knife Crime Incidents by Borough - 2018/19";
title2 "Excludes data for City of London & Heathrow Airport";
footnote1 j=l 'Data from the UK House of Commons Library';
footnote2 j=l 'Contains National Statistics data © Crown copyright and database right [2015]';
footnote3 j=l 'Contains Ordnance Survey data © Crown copyright and database right [2015]';

proc sgmap mapdata=london maprespdata=borough_sorted plotdata=centres;
	choromap total/ mapid=name name="choro";	
	text x=x y=y text=num;
	gradlegend "choro" / title="Incidents";
run;

title;
footnote;

proc print data=centres noobs;
var num name;
run;

 

 

Map5.png

 

 

Table1.png

 

The Results

 

We can see from the final map that generally speaking the number of incidents rises the closer you get to the centre of London with the largest value being for borough 33 - Westminster. This is about as close to the center of London as you can get, plus boroughs south of the river Thames tend to have less knife crime than those north of the river. Having said all this, London is still an incredibly safe place to visit, work and live in so please don't let this dissuade you from including it in your holiday itinerary!

 

Now it's your Turn!

 

Did you find something else interesting in this data? Share in the comments. I’m glad to answer any questions.

 

Visit [[this link]] to see all the Free Data Friday articles.

 

 

Comments

Cool

Has anyone done something like this for the US, where we have hundreds murdered each week?

Hi @tomrvincent - there's quite a lot of shooting incident data held on individual city data portals, for example New York which goes down to borough and precinct level. It should be possible to find a suitable shapefile to import into SAS and create a choropleth map for gun crime. I don't know if the data is easily available at state or national level though.

@ChrisBrooks  I found a very thorough (and horrifying) database at https://www.gunviolencearchive.org/query but it only seems to allow 500 records to be downloaded at a time.

 

https://everytownresearch.org/gun-violence-america/ and https://lawcenter.giffords.org/ also have a variety of charts but no searchable databases from what I can tell.

 

Even with 100 gun deaths per day, there isn't a national standardized reporting protocol in place.

Nice example!

My one suggestion would be to add the number of Incidents to your Proc Print table. 

Version history
Last update:
‎08-03-2021 01:30 PM
Updated by:

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Free course: Data Literacy Essentials

Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning  and boost your career prospects.

Get Started

Article Tags