BookmarkSubscribeRSS Feed

Exploring Commuting Patterns with SAS

Started ‎04-17-2020 by
Modified ‎08-03-2021 by
Views 3,433

Editor's note: SAS programming concepts in this and other Free Data Friday articles remain useful, but SAS OnDemand for Academics has replaced SAS University Edition as a free e-learning option. Hit the orange button below to start your journey with SAS OnDemand for Academics:

 

Access Now

 

 

Every day at 5 pm the UK government holds a televised press conference on the situation with regard photography-of-people-at-train-station-1311544.jpgto the current coronavirus pandemic. They always show three time series graphs - the first being the daily amount of traffic across  various modes of transport, in order to demonstrate the effectiveness of the lockdown.

 

In this edition of Free Data Friday we will be looking at data from the Census Transportation Planning Products Program (CTPP) to explore the popularity of commuting methods in the US.

 

Get the Data

 

The data can be downloaded from this page on the CTPP web site in a choice of formats. I chose to FreeDataFriday_graphic.jpgdownload it as a comma separated file (CSV) and used Proc Import to bring the data into a SAS data set like this

 

Get started with SAS OnDemand for Academics

 
In this 9-minute tutorial, SAS instructor @DomWeatherspoon shows you how to get your data into SAS OnDemand for Academics and other key steps:
 

Get Started

 

 

 

filename reffile  '/folders/myshortcuts/Dropbox/Means of Transportation.csv';

proc import datafile=reffile
	dbms=csv
	out=means
	replace;
	getnames=yes;
	guessingrows=2000;
	datarow=4;
run;

You will notice that as usual I used a large value for the guessingrows statement and that the data starts at row 4.

 

 

Getting the Data Ready

 

There were a few issues with the file after the import, in particular:

 

  1. The variable names weren't very meaningful; and
  2. The number of commuters was a character field instead of numeric; and
  3. There were some unwanted rows e.g. rows for margin of error and total rows

I decided to fix these with a data step

 

data means_clean(drop=var3 var4);
	length commute_num 8.;
	set means(rename=(A102106___Means_of_transportati=state
		var2=mode));
	commute_num=input(var4,comma32.);
	if var3="Estimate" and mode ne "Total, means of transportation" then output;
run;

This gave me a file looking like this

 

DS1.png

 

The Results

 

I decided to use the data to discover which state had the largest number of commuters for each mode of travel. This was quite straightforward using Proc Sort, Proc Rank and Proc SQL

 

proc sort data=means_clean out=sorted;
	by mode;
run;

proc rank data=sorted out=ranked_mode descending;
	by mode;
	var commute_num;
	ranks ranking;
run;

proc sql;
	select state, mode,commute_num
	from ranked_mode
	where ranking=1;
quit;

Here is the output from the Proc SQL call

 

Results1.png

 

Not surprisingly the list is dominated by the two huge states in terms of population, California and New York. Interestingly by far the most popular mode of transport is driving alone in a vehicle.It would seem that this method is a prime target for authorities to target to try to encourage car pooling and thereby reduce congestion, global warming and pollution.

 

I then decided to see what else I could discover with this data set by ranking the states according to the position of bicycle riding in their own "league table" of methods

 

proc sort data=means_clean out=sorted;
	by state mode;
run;

proc rank data=sorted out=ranked_state descending;
	by state;
	var commute_num;
	ranks ranking;
run;

proc sql;
	select state, mode,ranking, commute_num
	from ranked_state
	where mode="Bicycle"
	order by ranking;
quit;

This gives

 

Results2.png

 

We can see that the winner here is Montana where bicycle riding is the fifth most popular method of commuting. I confess that I have no idea why this would be the case so if you know please leave a message in the comments below.

 

There is a lot more you could do with the data in this and other tables from from the CTPP so why not download SAS University Edition and some data and give it a try yourself? If you have any problems just leave a question in the appropriate SAS Communities Forum.

 

Now it's Your Turn!

 

Did you find something else interesting in this data? Share in the comments. I’m glad to answer any questions.

 

Visit [[this link]] to see all the Free Data Friday articles.

 

Version history
Last update:
‎08-03-2021 09:51 AM
Updated by:

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Free course: Data Literacy Essentials

Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning  and boost your career prospects.

Get Started

Article Tags