
Measuring Geographic Imbalance in the Brexit Vote with SAS

Started ‎10-11-2019 by
Modified ‎08-03-2021 by

Editor's note: SAS programming concepts in this and other Free Data Friday articles remain useful, but SAS OnDemand for Academics has replaced SAS University Edition as a free e-learning option.


 

British politics has been dominated by one subject for the last three years – Brexit. The unexpected victory of leave voters in the 2016 European Union referendum has pushed all other topics to the periphery.

 

Since that vote, two prime ministers have resigned, the ruling Conservative Party has lost its overall majority in the British Parliament, there have been large-scale defections from both the Conservative and Labour parties in the House of Commons, and new parties have risen and fallen at a rapid pace.

 

Now, with the October 31 deadline for leaving the EU getting closer and closer, another general election is expected at any time. One feature of this election is expected to be the geographical split between leave and remain areas.

 

Although the UK voted by a margin of 52% to 48% to leave, that narrow result masks wide geographical differences. In this edition of Free Data Friday, we will be using results data from the UK Electoral Commission to look at a way of measuring the degree of difference between areas in that vote.

 

Get the Data

 

You can download the results of the vote in CSV format from the UK Electoral Commission website and import the file into SAS with PROC IMPORT.

 


 

Getting the Data Ready

 

The file imported without any significant issues. There are a couple of minor points which we will need to address:

 

  1. The British Overseas Territory of Gibraltar is part of the EU by virtue of its relationship to the UK and therefore took part in the vote. However, it is not represented in the UK Parliament, so we will exclude it from our calculations; and

  2. Some of the fields which should be numeric were imported as character fields and will be converted.

Here is the code for the import and data cleaning:

 

 

filename reffile '/folders/myshortcuts/Dropbox/EU-referendum-result-data.csv';

proc import datafile=reffile
	dbms=csv
	out=import
	replace;
	getnames=yes;
	guessingrows=700;
run;

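/* Area Code GI is Gibraltar; it is excluded by the WHERE= option below. */
/* Each vote count is re-read as a number with INPUT(); the original     */
/* column is dropped and the numeric copy is renamed to take its place.  */

data area_results(keep=region area valid_votes remain leave);
	set import(where=(area_code ne "GI"));
	/* Remain votes */
	new_remain = input(remain, 8.);
	drop remain;
	rename new_remain=remain;
	/* Leave votes */
	new_leave = input(leave, 8.);
	drop leave;
	rename new_leave=leave;
	/* Total valid votes */
	new_valid = input(valid_votes, 8.);
	drop valid_votes;
	rename new_valid=valid_votes;
run;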
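Before moving on, it can be worth confirming that the conversion behaved as intended. The following is a minimal sketch, assuming the area_results data set created above; it is a suggested check rather than part of the original workflow.

```sas
/* List variable names and types; the three vote counts should be numeric */
proc contents data=area_results;
run;

/* Row count, totals and range of the vote counts as a quick sanity check */
proc means data=area_results n sum min max;
	var valid_votes remain leave;
run;
```

If any of the three variables still shows as character, or the totals look implausible, the import step is the first place to look.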

 

 

This is what the imported file looks like:

 

[Image: DS File 1.png, a view of the imported data set]

 

 

 

The Index of Dissimilarity

 

The Index of Dissimilarity can be used to determine the percentage of one constituent part of the population which would have to move areas to achieve a uniform geographic distribution amongst the sub-areas of a larger area. It is often used to determine racial balance in areas within a state or country or gender balance within occupations.

 

In our case, an index of zero would mean that every counting area had the same 52% leave to 48% remain split as the overall total. The method of calculating the index was discussed in a 2016 SAS Communities Forum thread, and we will use the PROC SQL statement from that thread to perform the calculation.
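For reference, the calculation performed by the PROC SQL statement in this article can be written as a formula. The notation is our own assumption, not taken from the forum thread: $l_i$ and $r_i$ are the leave and remain votes in counting area $i$, $L$ and $R$ are the leave and remain totals across all $n$ areas.

```latex
D = \frac{1}{2} \sum_{i=1}^{n} \left| \frac{l_i}{L} - \frac{r_i}{R} \right|
```

$D$ ranges from 0 to 1. It is 0 when every area holds the same share of the leave total as it does of the remain total, and it approaches 1 as the two groups become concentrated in different areas.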

 

The Results

 

Here is the PROC SQL statement referred to earlier which calculates the index:

 

 

proc sql;
  create table uk_index as
  select *, leave/sum(leave) as var1, remain/sum(remain) as var2
  from area_results;
  select 0.5*sum(abs(var1-var2)) as d1
  from uk_index;
quit;

 

Here is the result:

 

[Image: Result all UK.png, PROC SQL output showing the index value for the whole UK]

 

 

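This means that 16.5% of UK leave voters would have to move counting areas for the vote split to be uniform across all areas. Applied to all 33,551,983 valid votes, that percentage comes to over 5.5 million; applied only to the roughly 17.4 million leave votes, it comes to about 2.9 million people. Either way, this is a significant imbalance.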
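To turn the index into a head count directly in SAS, one option is to multiply it by the total number of leave votes. This is a minimal sketch, assuming the uk_index table created above; the column names total_leave and leave_to_move are illustrative and do not appear in the original code.

```sas
/* Sketch: express the index as a number of leave voters.        */
/* Assumes uk_index (with var1 and var2) from the earlier step.  */
proc sql;
  select 0.5*sum(abs(var1-var2))            as d1            format=6.3,
         sum(leave)                         as total_leave   format=comma12.,
         0.5*sum(abs(var1-var2))*sum(leave) as leave_to_move format=comma12.
  from uk_index;
quit;
```

Because the index is defined as a share of one group, the multiplier here is the leave total rather than all valid votes.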

 

This isn't the whole story, however: while England and Wales voted for leave, the other two constituent nations of the UK, Scotland and Northern Ireland, voted for remain. I decided to see if the internal dissimilarity in England, Wales and Scotland was roughly the same (Northern Ireland was a single counting area, so we can't calculate an index for it).

 

Here is the code for those calculations:

 

 

data regional_results_eng;
	set area_results(where=(region not in ("Scotland", "Wales", "Northern Ireland")));
run;

proc sql;
  create table eng_index as
  select *, leave/sum(leave) as var1, remain/sum(remain) as var2
  from regional_results_eng
  ;
  select 0.5*sum(abs(var1-var2)) as d1
  from eng_index;
quit;


data regional_results_scot;
	set area_results(where=(region ="Scotland"));
run;

proc sql;
  create table scot_index as
  select *, leave/sum(leave) as var1, remain/sum(remain) as var2
  from regional_results_scot
  ;
  select 0.5*sum(abs(var1-var2)) as d1
  from scot_index;
quit;


data regional_results_wales;
	set area_results(where=(region ="Wales"));
run;

proc sql;
  create table wales_index as
  select *, leave/sum(leave) as var1, remain/sum(remain) as var2
  from regional_results_wales
  ;
  select 0.5*sum(abs(var1-var2)) as d1
  from wales_index;
quit;
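The three blocks above differ only in their WHERE clause, so the same figures could be produced in one pass. The following is a sketch under stated assumptions: it derives a hypothetical nation column (not present in the source data) from region, then relies on PROC SQL remerging the group totals back onto each row.

```sas
/* Sketch: one query per step instead of one block per nation.          */
/* "nation" is an assumed helper column: any region that is not         */
/* Scotland, Wales or Northern Ireland is treated as part of England.   */
proc sql;
  create table nation_shares as
  select nation, area,
         leave/sum(leave)   as var1,   /* area's share of the nation's leave votes  */
         remain/sum(remain) as var2    /* area's share of the nation's remain votes */
  from (select *,
               case
                 when region in ("Scotland", "Wales", "Northern Ireland") then region
                 else "England"
               end as nation
        from area_results) as r
  group by nation;

  /* Northern Ireland is a single counting area, so its index would be 0 */
  select nation, 0.5*sum(abs(var1-var2)) as d1
  from nation_shares
  where nation ne "Northern Ireland"
  group by nation;
quit;
```

The SAS log will note that summary statistics were remerged with the original data, which is the same behaviour the individual blocks depend on.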

 

Here is the index value for England:

 

[Image: Result England.png, PROC SQL output showing the index value for England]

 

A score of 15.5% is pretty close to the UK total (perhaps not surprisingly, given the size of England's population relative to the UK total).

 

This is Scotland's score:

 

[Image: Result Scotland.png, PROC SQL output showing the index value for Scotland]

 

This is a lot less than the UK and England-only values and of course with remain having won, only a shade over 273,000 remain voters would have to move areas to achieve uniformity.

 

Finally, the index for Wales is:

 

[Image: Result Wales.png, PROC SQL output showing the index value for Wales]

 

This is the smallest of the three index values: only just over 144,700 leave voters would need to move to achieve the perfect zero index.

 

The question is, then, what does this mean for the upcoming election? The results imply that, especially in England, leave areas are more pro-leave than the country as a whole and remain areas more pro-remain. If people vote along strict leave/remain lines, this is likely to lead to a very polarised result, particularly if parties campaign with a strategy of mobilising their base votes rather than winning over opposition voters. Having said that, events are moving at a very rapid pace and, as is often said, the only certain thing with Brexit is that nothing is certain!

 

Now It's Your Turn!

 

Did you find something else interesting in this data? Share in the comments. I’m glad to answer any questions.

 

Visit [[this link]] to see all the Free Data Friday articles.

Comments

So the people of the conquered lands, given the belated right to vote, would prefer to be part of a voluntary association by choice that they never got a chance to vote on joining in the first place. That makes sense.

 

At least the UK doesn't have some sort of barbaric 'Electoral College'... who knows what the result would have been... 🙂

Hi @tomrvincent, there are a lot of very strange things going on in British politics at the moment. Bizarrely, the opposition have twice blocked the holding of an early general election. Imagine Trump offering the Democrats an early presidential election and them turning it down...

 

As the saying goes "If you're not confused, you don't really understand the situation"!
