Conventional wisdom has it that in politics, money talks. This is especially true when elections come round, as a large war chest is regarded as essential for success at the ballot box. In this edition of Free Data Friday, we will be looking at data from the UK Electoral Commission detailing the spending of every party that won seats in the 2019 General Election to see if money really can buy you power in the world of UK politics.
The Electoral Commission website has a search tool which allows you to filter parties’ spending by election and party name. The results can then be downloaded as a CSV file. There is one important caveat here: the filter isn’t 100% accurate, because you can’t specify an exact search term for party name. This means that searching for “Labour Party” returns results both for the Labour Party and the Socialist Labour Party, a completely different organization. There are a few other, similar, cases and so I had to include a clean-up step later in the processing. In total I had to download and rename ten separate CSV files.
Because there were ten files to import into SAS, I wrote a short macro to import each CSV file and then run a data step which kept only the variables we needed and converted the expenditure variable from character to numeric.
%macro importcosts(fname);
%let fullfname="/home/chris52brooks/UKPartyExp/&fname..csv";
filename reffile &fullfname;
option validvarname=any;
proc import datafile=reffile
dbms=csv
out=&fname
replace;
getnames=yes;
guessingrows=1000;
run;
%let keepstatement=(keep=RegulatedEntityName ExpenseCategoryName TotalExpenditure);
data &fname;
set &fname.&keepstatement;
total=input(TotalExpenditure,nlmnlgbp10.2);
drop TotalExpenditure;
run;
%mend;
%importcosts(alliance);
%importcosts(cons);
%importcosts(dup);
%importcosts(green);
%importcosts(labour);
%importcosts(libdems);
%importcosts(pc);
%importcosts(sdlp);
%importcosts(sf);
%importcosts(snp);
This gave me ten data sets, each holding the party name, the expense category and the numeric total.
I used another data step to append all ten files and check the RegulatedEntityName (effectively party) variable, dropping unwanted records as described earlier.
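A quick way to confirm that the character-to-numeric conversion behaved as intended is to count the missing values it produced. The following is only a sketch: it assumes the macro calls above have already run, and it uses the labour data set purely as an example.

```sas
/* Sketch only: count rows and missing totals in one imported data set.   */
/* "labour" is one of the ten data sets created by the macro calls above. */
/* A non-zero n_missing_total would suggest the informat did not match    */
/* some of the values in the CSV file.                                    */
proc sql;
  select count(*)     as n_rows,
         nmiss(total) as n_missing_total
  from labour;
quit;
```

The same query can be pointed at any of the other nine data sets by changing the name in the FROM clause.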
data allparties;
set alliance
cons
dup
green
labour
libdems
pc
sdlp
sf
snp
;
if RegulatedEntityName not in
("Alliance - Alliance Party of Northern Ireland"
"Conservative and Unionist Party"
"Democratic Unionist Party - D.U.P."
"Green Party"
"Labour Party"
"Liberal Democrats"
"Plaid Cymru - The Party of Wales"
"SDLP (Social Democratic & Labour Party)"
"Scottish National Party (SNP)"
"Sinn Féin") then delete;
run;
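To see exactly which unwanted names the search tool let through, it can help to list every entity name before filtering. This is a minimal sketch rather than part of the original workflow; allraw is a hypothetical scratch data set name.

```sas
/* Sketch only: list every entity name present before filtering. */
/* "allraw" is a hypothetical scratch data set name.             */
data allraw;
  set alliance cons dup green labour libdems pc sdlp sf snp;
run;

/* One row per distinct name, with the number of records for each */
proc freq data=allraw;
  tables RegulatedEntityName / nocum nopercent;
run;
```

Any name in this listing that is not in the list used by the data step above is a record that the filter removes.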
In UK elections it is normal for parties to not contest every possible seat and as I wanted to find out how much each party had spent on its candidates on average, I used Wikipedia to get the number of candidates and seats won by each party in the election and created a data set holding that information. There are two things to note here.
data results;
infile datalines dlm=",";
length RegulatedEntityName $45 candidates 8. winners 8. NI 8.;
input RegulatedEntityName candidates winners ni;
datalines;
Conservative and Unionist Party,635,365,0
Labour Party,631,202,0
Liberal Democrats,611,11,0
Scottish National Party (SNP),59,48,0
Green Party,472,1,0
Democratic Unionist Party - D.U.P.,17,8,1
Sinn Féin,15,7,1
Plaid Cymru - The Party of Wales,36,4,0
Alliance - Alliance Party of Northern Ireland,18,1,1
SDLP (Social Democratic & Labour Party),15,2,1
;
run;
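Because these figures were typed in by hand, a simple total is a cheap sanity check. The sketch below assumes the results data set has been created as shown; with the data lines as typed, the winners column should sum to 649.

```sas
/* Sketch only: totals across all ten parties in the hand-entered data */
proc sql;
  select sum(candidates) as total_candidates,
         sum(winners)    as total_winners
  from results;
quit;
```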
Finally, I created a format for displaying the generally accepted short names of the parties rather than their often-cumbersome official names.
proc format;
value $shortname
'Conservative and Unionist Party'="Conservative"
'Labour Party'="Labour"
'Liberal Democrats'="Lib Dems"
'Scottish National Party (SNP)'="SNP"
'Green Party'="Green"
'Democratic Unionist Party - D.U.P.'="DUP"
'Plaid Cymru - The Party of Wales'="Plaid Cymru"
'Alliance - Alliance Party of Northern Ireland'="Alliance"
'SDLP (Social Democratic & Labour Party)'="SDLP";
run;
After this extensive data preparation phase I was finally ready to embark on my analysis. I ran Proc Means to get the total expenditure for each party by category as I wanted to compare what the big spenders had spent all their money on.
proc means data=allparties noprint;
class RegulatedEntityName ExpenseCategoryName;
var Total;
output out=alltots sum=category_total;
run;
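The later steps filter on the automatic _type_ variable, so it is worth checking what the output data set contains. With two CLASS variables and no NWAY option, Proc Means writes rows for every level of summarisation: _type_=0 is the overall total, _type_=1 is by ExpenseCategoryName only, _type_=2 is by RegulatedEntityName only, and _type_=3 is each party and category combination. A minimal sketch, assuming alltots exists as created above:

```sas
/* Sketch only: how many rows of each summary level are in alltots */
proc freq data=alltots;
  tables _type_ / nocum nopercent;
run;

/* The per-party totals used in the first chart are the _type_=2 rows */
proc print data=alltots(where=(_type_=2)) noobs;
  var RegulatedEntityName category_total;
run;
```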
Firstly, I decided to look at the parties’ total spending over all categories using Proc SGPlot.
ods graphics / reset;
proc sgplot data=alltots(where=(_type_=2));
format category_total comma14.2 RegulatedEntityName $shortname.;
title1 "UK General Election 2019";
title2 "Total Spending by Party";
footnote j=r "Data From: The Electoral Commission";
hbar RegulatedEntityName / response=category_total
datalabel datalabelattrs=(weight=bold) categoryorder=respdesc;
xaxis grid label="Total Spending GBP";
yaxis grid label='Party Name';
run;
There was a huge surprise here. The top spenders were the Conservatives, which wasn’t unexpected since they formed the outgoing Government, but in second place, instead of the official opposition Labour Party, were the Liberal Democrats. In the previous election they had won only 12 seats as opposed to Labour’s 262, and yet in 2019 they outspent Labour by over £3m. This was so surprising that my initial response was that I must have made a mistake, so I rechecked the source data and my workings. When they seemed correct, I verified the fact with a third-party source.
I then wanted to discover what the parties had spent their money on and whether extra spending had won them extra votes. I merged the output from the Proc Means with the results file I had created earlier, filtered out some record types I didn’t want and used Proc SGPlot again to display how much the parties had spent in each category. This time I excluded the Northern Ireland parties because their campaigns, as previously mentioned, are very different to the rest of the UK.
proc sort data=alltots;
by RegulatedEntityName;
run;
proc sort data=results;
by RegulatedEntityName;
run;
data merged;
merge alltots(where=(_type_>1)) results;
by RegulatedEntityName;
cpc=category_total/candidates;
cpw=category_total/winners;
run;
data catsonly;
set merged(where=(_type_=3));
run;
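The merge relies on the hand-typed party names in results matching the names in the Electoral Commission data exactly. A minimal sketch of one way to check this, assuming both data sets have been sorted as above (in_spend and in_results are hypothetical flag names):

```sas
/* Sketch only: report any party present in one data set but not the other */
data _null_;
  merge alltots(where=(_type_=2) in=in_spend)
        results(in=in_results);
  by RegulatedEntityName;
  /* IN= flags are 1 when the current BY group came from that data set */
  if not (in_spend and in_results) then
    put "Unmatched party: " RegulatedEntityName= in_spend= in_results=;
run;
```

If nothing is written to the log, every party in the spending totals found a row in results and vice versa.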
ods graphics / reset;
proc sgplot data=catsonly(where=(NI=0));
format category_total comma14.2 RegulatedEntityName $shortname.;
label ExpenseCategoryName="Expense Category";
title1 "UK General Election 2019";
title2 "Spending by Category by Party (Excluding Northern Ireland)";
footnote j=r "Data From: The Electoral Commission";
vbar RegulatedEntityName / response=category_total group=ExpenseCategoryName
categoryorder=respdesc groupdisplay=cluster;
xaxis grid label="Party Name";
yaxis grid label="Total Spending GBP";
run;
There was another big surprise here in that the Liberal Democrats spent far more than any other party on “Unsolicited material to electors”. This consists mainly of flyers and letters posted to households and is a very traditional way of communicating with voters. By contrast, Labour concentrated their spending on advertising and the Conservatives spent a lot on market research and canvassing.
The acid test of all this spending is: “Does it work?” To find out, I ran two more Proc SGPlots, charting the spending per candidate and the spending per successful candidate.
proc sgplot data=merged(where=(_type_=2 and NI=0));
format cpc comma10.2 RegulatedEntityName $shortname.;
title1 "UK General Election 2019";
title2 "Total Spending per Candidate by Party";
footnote j=r "Data From: The Electoral Commission";
hbar RegulatedEntityName / response=cpc
datalabel datalabelattrs=(weight=bold) categoryorder=respdesc;
xaxis grid label="Total Spending GBP";
yaxis grid label='Party Name';
run;
proc sgplot data=merged(where=(_type_=2 and NI=0));
format cpw comma14.2 RegulatedEntityName $shortname.;
title1 "UK General Election 2019";
title2 "Total Spending per Successful Candidate by Party";
footnote j=r "Data From: The Electoral Commission";
hbar RegulatedEntityName / response=cpw
datalabel datalabelattrs=(weight=bold) categoryorder=respdesc;
xaxis grid label="Total Spending GBP";
yaxis grid label='Party Name';
run;
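The figures behind both charts can also be listed as a table, which makes the arithmetic explicit: cpc is the party total divided by candidates, and cpw is the party total divided by winners. This is a sketch only, assuming the merged data set and the $shortname format exist as created earlier.

```sas
/* Sketch only: tabulate the inputs and the two derived ratios */
proc print data=merged(where=(_type_=2 and NI=0)) noobs;
  var RegulatedEntityName candidates winners category_total cpc cpw;
  format category_total cpc cpw comma14.2
         RegulatedEntityName $shortname.;
run;
```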
Amazingly, each successful candidate cost the Liberal Democrats over £1.3m and the Green Party spent £476,000 getting one candidate elected. The most efficient spending was by the SNP, with every MP returned costing only about £21,500.
In conclusion, I think we can say that while money undoubtedly helps (the biggest spenders still won the election), it certainly doesn’t guarantee success. Moreover, it could be that the day of unsolicited mail has passed (in our house it nearly all goes, unread, straight into the recycling bag) and possibly market research and paid advertising are the way to go.
There is a huge amount of other information in this data with many more avenues to explore, so why not have a look and try to find some more insights?
Did you find something else interesting in this data? Share in the comments. I’m glad to answer any questions.