
The Trump 2020 campaign: A look at contributions with SAS

Started ‎06-14-2019 by
Modified ‎08-03-2021 by

SAS programming concepts in this and other Free Data Friday articles remain useful, but SAS OnDemand for Academics has replaced SAS University Edition as a free e-learning option.

 

As the process for electing the US President in 2020 gets underway with the declaration of candidates for their parties’ nomination, more and more data about that process is becoming available. A great resource for this data is the Federal Election Commission (FEC) and, in this post, we will be examining the source of funds raised by President Trump’s campaign. I chose the Trump campaign because it is, at the time of writing, the campaign with by far the highest level of donations received. The same process could be used for any of the declared candidates.

 

Get the Data

 

You can download the data from the FEC web site as a CSV file. I would strongly recommend only downloading data for one candidate at a time as otherwise the downloaded file will be huge. There are many more variables in the download than are displayed on the FEC visualization and we need to ensure that SAS University Edition can handle a file of the size we intend to use. Failure to do that can result in a corrupted installation forcing you to reinstall SAS UE (trust me, it’s happened to me…).


Getting the Data Ready

 

Firstly, because the downloaded file doesn’t have a very meaningful name, I renamed it for my convenience. I then used the Import Task to import the file into SAS – I had to use a very large value for the GUESSINGROWS option (68,000) as I discovered that one of the codes switches from being purely numeric to alphanumeric just before that level is reached. After examining the file I decided I didn’t need to make any changes to it in order to successfully carry out my analysis.
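For readers who prefer code to the Import Task, the import step could be sketched roughly as follows. The library path and CSV file name are hypothetical placeholders, not the ones I actually used; only the GUESSINGROWS value and the campaign.trump data set name come from this article.

```sas
/* Hypothetical folder and file name: adjust both to your own environment */
libname campaign "/folders/myfolders/campaign";

proc import datafile="/folders/myfolders/trump_contributions.csv"
		out=campaign.trump
		dbms=csv
		replace;
	getnames=yes;        /* first row of the CSV holds the variable names */
	guessingrows=68000;  /* scan enough rows to catch the code that turns alphanumeric */
run;
```

A large GUESSINGROWS value slows the import down, because SAS scans that many rows before deciding each variable's type and length, but it avoids a code column being read as numeric and then losing its alphanumeric values.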

 

The Results

 

With no fewer than 77 variables in the file I had to decide which aspects of the data to examine. I decided to focus on where, geographically, the major donors were and what type of donors predominated (individuals, committees, PACs etc). My first step was to run a Proc Means on the data using contributor_state and entity_type as the Class Variables.

 

 

proc means data=campaign.trump noprint;
	class contributor_state entity_type;
	output out=trump_aggs sum=total_contributions
		mean=average_contribution;
	var contribution_receipt_amount;
run;

 

This is what the output looks like:

 

Proc Means Output.png

 

I decided to create a pie chart using the SGPie Procedure showing the split between entity types. Firstly, however, because I wanted the entity types expanded to show their full description instead of just the abbreviation in the file, I created a custom format.

 

 

proc format;
	value $entity_type
	'IND'='Individual'
	'COM'='Committee'
	'ORG'='Organization'
	'PAC'='Political Action Committee'
	'PTY'='Party';
run;

 

Now I can create the pie chart – notice how I can apply the custom format within the procedure call without first associating it with the variable in the data set. I also use the _type_ variable to determine which combination of class variables to chart.

 

 

ods graphics / reset;
title1 'Contributions to Trump Campaign 2020';
title2 'Total Contributions by Entity Type';
footnote j=l 'Data from https://www.fec.gov';
proc sgpie data=trump_aggs(where=(_type_=1));
	format total_contributions dollar11.0
		entity_type $entity_type.;	
	pie entity_type / response=total_contributions dataskin=gloss datalabeldisplay=all;
run;
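If the _type_ values are unfamiliar, a quick check like the sketch below may help. It assumes the trump_aggs data set created by the Proc Means step earlier. With two class variables and no NWAY option, Proc Means writes four kinds of summary row: _type_=0 is the overall total, _type_=1 is entity_type alone (the last class variable), _type_=2 is contributor_state alone, and _type_=3 is every state and entity type combination.

```sas
/* Count how many summary rows exist for each _type_ value */
proc freq data=trump_aggs;
	tables _type_ / nocum nopercent;
run;

/* List the entity-type-only rows that feed the pie chart */
proc print data=trump_aggs(where=(_type_=1)) noobs;
	var entity_type _freq_ total_contributions average_contribution;
run;
```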

 

This is what the generated pie chart looks like:

 

Pie Chart.png

 

We can see that about two thirds of the campaign donations are from committees, with a further 30% from individuals. The rest comes from all other categories combined. I checked the source file and discovered that there are only 6 contributors in the committee group and the “donations” appear to be transfers in from affiliated committees. That doesn't seem a very profitable line for further enquiry, so I decided to concentrate on the donations from individuals.

 

I was curious about where the majority of donations came from, so I created a horizontal bar chart showing the total dollar amount of contributions by state. Before doing that, however, I built another custom format that converted the state codes into full state names.

 

 

/* Taken from http://support.sas.com/kb/25/301.html */

proc format;
	value $statename
	'AL'='Alabama'				
	'AK'='Alaska'				
	'AZ'='Arizona'				
	'AR'='Arkansas'				
	'CA'='California'			
	'CO'='Colorado'				
	'CT'='Connecticut'			
	'DE'='Delaware'				
	'DC'='District of Columbia'	
	'FL'='Florida'				
	'GA'='Georgia'				
	'HI'='Hawaii'				
	'ID'='Idaho'				
	'IL'='Illinois'				
	'IN'='Indiana'				
	'IA'='Iowa'					
	'KS'='Kansas'				
	'KY'='Kentucky'				
	'LA'='Louisiana'			
	'ME'='Maine'				
	'MD'='Maryland'				
	'MA'='Massachusetts'		
	'MI'='Michigan'				
	'MN'='Minnesota'			
	'MS'='Mississippi'			
	'MO'='Missouri'				
	'MT'='Montana'				
	'NE'='Nebraska'
	'NV'='Nevada'
	'NH'='New Hampshire'
	'NJ'='New Jersey'
	'NM'='New Mexico'
	'NY'='New York'
	'NC'='North Carolina'
	'ND'='North Dakota'
	'OH'='Ohio'
	'OK'='Oklahoma'
	'OR'='Oregon'
	'PA'='Pennsylvania'
	'RI'='Rhode Island'
	'SC'='South Carolina'
	'SD'='South Dakota'
	'TN'='Tennessee'
	'TX'='Texas'
	'UT'='Utah'
	'VT'='Vermont'
	'VA'='Virginia'
	'WA'='Washington'
	'WV'='West Virginia'
	'WI'='Wisconsin'
	'WY'='Wyoming'
	'RQ'='Puerto Rico'
	'GQ'='Guam'
	'99'='Foreign';
run;

 

There are a few things to note in the Proc SGPlot call:

 

  1. in the data set where clause I limited the display to those states with more than nineteen contributions (the _freq_ variable counts the rows, i.e. the individual contributions, in each group). This was important as the source data contains special state codes for overseas and military donations which, although very small in number, might skew the output; and
  2. I’m a big fan of tooltips as I believe that they are an excellent way of adding additional information to a chart without introducing any visual clutter. Proc SGPlot makes it easy to create them – you use the tip, tiplabel and tipformat options in ordered, space separated lists.

Here is the code to generate the chart:

 

 

ods graphics / reset width=8in height=10in imagemap;
title1 'Contributions to Trump Campaign 2020';
title2 'Total Contribution (Individuals) by State';
footnote1 j=l 'Data from https://www.fec.gov';
footnote2 j=l 'Minimum 20 Contributions';
proc sgplot data=trump_aggs(where=(_type_=3 and entity_type="IND"
		and _freq_>19));
	hbar contributor_state /  response=total_contributions
		categoryorder=respdesc
		dataskin=pressed fillattrs=(color=vpab)
		tip=(contributor_state _freq_ total_contributions)
		tiplabel=("State" "No. of Contributors" "Total of Contributions")
		tipformat=($statename. comma8.0 dollar12.0);
	xaxis label="Total Contributions in US$";
	yaxis label="Contributor State" fitpolicy=none;
run;

 

This is what the generated chart looks like:

 

First Bar Chart.png

 

There are no great surprises here with the largest amounts raised coming from the big states of Florida, Texas and California.

 

I then charted the average donation amount by state, again using Proc SGPlot:

 

 

ods graphics / reset width=8in height=10in imagemap;
title1 'Contributions to Trump Campaign 2020';
title2 'Average Contribution (Individuals) by State';
footnote1 j=l 'Data from https://www.fec.gov';
footnote2 j=l 'Minimum 20 Contributions';
proc sgplot data=trump_aggs(where=(_type_=3 and entity_type="IND"
		and _freq_>19));
	hbar contributor_state /  response=average_contribution
		categoryorder=respdesc
		dataskin=pressed fillattrs=(color=vpab)
		tip=(contributor_state _freq_ average_contribution) 
		tiplabel=("State" "No. of Contributors" "Average Contribution")
		tipformat=($statename. comma8.0 dollar8.2);
	xaxis label="Average Contribution in US$";
	yaxis label="Contributor State" fitpolicy=none;
run;

 

This was the generated chart:

 

Second Bar Chart.png

 

One big surprise here was that Washington DC had by far the highest average donation – over $500 against $125 for the next most generous state (Ohio). At this point I should add that I know the District of Columbia is not a state, but it’s usually included in these types of analyses.

 

The question, then, is what does this mean? There were only 49 contributions from DC against over 4,000 from Ohio, so the results need to be treated with some caution. Could it be that the sort of people who donate to campaigns are, on average, wealthier in DC than anywhere else, or are capital residents just more politically committed? More analysis is needed to try to answer that question and the many others that can be generated from this extensive data set.
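As a possible first step in that further analysis, the sketch below compares the mean with the median for the two places. This is not something I ran for the article, so treat it as a suggestion only. With just 49 contributions, a handful of large donations could pull the DC mean up sharply, and the median would show whether that is what is happening.

```sas
/* Compare mean and median individual contributions for DC and Ohio */
proc means data=campaign.trump n mean median min max maxdec=2;
	where entity_type='IND' and contributor_state in ('DC','OH');
	class contributor_state;
	var contribution_receipt_amount;
run;
```

If the DC median turns out to be close to Ohio's, the high average is probably driven by a few big donors rather than by DC donors in general.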

 

Now it's your Turn!

 

Did you find something else interesting in this data? Share in the comments. I’m glad to answer any questions.

 

Visit [[this link]] to see all the Free Data Friday articles.

Comments

Great article @ChrisBrooks!  Did you know that SAS has a few functions that can convert a state code to a state name?  Check out the STNAME and STNAMEL functions.  To use them in your example, you would probably need a DATA step to recode and still account for foreign sources.

Thanks @ChrisHemedinger you're right - STNAME would have done just as well as the format.

 

I must admit that, being British, I've never seen these "foreign state codes" before. My initial reaction on seeing them in the source data was that they must be errors. Fortunately I googled one of them and discovered what they were - luckily they were too few in number to materially affect the results....

Horrifying.

Great article, Beverly!  Interesting!

Triggered
