BookmarkSubscribeRSS Feed

Analyze United States Airport Claims Data: Programming 1 Case Study

Started ‎09-27-2018 by
Modified ‎09-28-2018 by
Views 11,676

 

Analyze United States Airport Claims Data

 
Have you ever been to an airport in the United States and your luggage was lost or you experienced an incident that involved theft, property damage, personal injury or even death? If so, you can file a claim and a record of that incident will be documented by the United States Transportation Security Administration (TSA).

 

In this case study you will apply your knowledge from the SAS Programming 1 course to Access, Explore, Prepare, Analyze and Export over 200,000 TSA Claims from 2002 – 2017. You will run through the SAS programming process to investigate these claims that contain information like incident date, date received, airport, airport location, claim type, and claim site. Using this information you can answer questions like:

 

  • Which year had the most claims?
  • What type of claims happen most often?
  • Which state has the most claims?
  • How are claims typically settle?

 

When working on this case study you might be thinking "what do I do if I get stuck?" That's an excellent question! Let's discuss the ways you can attempt the case study to best suit your specific needs:

 

  1. If you feel comfortable with the topics in the SAS Programming 1 course and want to treat this as a real-world problem, read the Introduction, Data Layout and Requirements sections of the PDF and begin! During the process feel free to use your notes, Google, or SAS Documentation. When you are done look at the Results section in the PDF to compare your results to the results provided.
  2. If you think you might need a bit of assistance in the case study use the Assignment Guide section to help guide you. The guide will not give you the solutions, but more of a road map on how to solve problem.
  3. If you are new to SAS and want to use the case study as more of a demo feel free to do so. You can use the Assignment Guide and Hints section to give you a suggested solution to each specific problem. I recommend after your run through the case study as a demo to go back and attempt it on your own.

 

If you have any questions regarding the case study, if you complete the case study and come up with different solutions and want to show them off, or if you want to post additional visualizations or analysis of the data create a post in the SAS Programming 1 and 2 ForumWe would love to hear from you!

 

To attempt this case study, you will need to download the:

  • SAS Programming 1: Essentials Case Study PDF
  • TSAClaims2002_2017.csv file

 

 

Be curious. What other questions can you answer with this data? 

Comments

Hi Guys,

 

I'm completely new to SAS i would like to switch to SAS kindly give valuable information and suggestion to get hands on experience like career advises, how to learn better ways.

 

looking forward to seeing from you.

 

Sreeni.

@srenivas661,

 

Hello. I would start by taking the free SAS® Programming 1: Essentials e-learning course. That course can be found here.

 

On that page you can also search for in person classes or live web instruction.

 

- Peter

Hi guys, I am attempting the case study on the Airports Claim data and my proc import step gives me an error that the physical file does not exist, and a direct upload does not work because the file is bigger than the default size. I have converted the file to a csv so far because I downloaded it as an xlsx file. 

here is my import code 

libname TSA "/folders/myshortcuts/EPG194/OUTPUT";
%let path=C:\Users\MALTEZ\Desktop;
run;
options validvarname=v7;
proc import datafile="&path\TSAClaims2002_2017.csv"
dbms=csv
out=TSA.importedclaims
replace;
guessingrows=max;
run; 

.please help 

Hi @NicciCode ,

 

Seems like you are using SAS University Edition? If you are your %let path seems incorrect. SAS is searching for that file and can't find it. You have to make sure you place the TSA Claims.csv file in your myfolders of SAS University Edition. 

 

Try this. Make sure you can see the TSA Claims CSV file in SAS University Edition. Go to your files and folders on the left navigation pain and right click on the the folder where the Claims CSV data resides and select properties. Copy that location.Use that location for %let path=. I added a screenshot of how I imported it.

 

Import TSA.jpg

 

Be aware when using Guessingrows=max SAS will scan every row in every column to determine exactly the column length and sizes you need. This will take about 2 minutes or so. Another method would be to use the data step with the infile statement. You will actually see this code in the log after PROC IMPORT completes. You can look into that further if you would like. That will import much faster, but you have to specify everything. I'd rather let SAS do the work for me.

 

Again the key is to make sure you can see the TSA Claims CSV file in your environment first. Then follow my steps.

 

Keep me posted.

 

- Peter

 

 

 

Got it. I did the import from my VM and it worked. I was initially trying to upload it from my SAS studio but it could not because of the file size. Also, I have looked at the file statement, but I prefer the proc import. Thank You.

Could you be kind enough as to write the infile syntax here. I cannot seem to find my way around it. Thank you. 

 

What's nice about PROC IMPORT is that in the log it will write out the INFILE syntax for you. I'll clean it up a bit and paste it, but you can always check the log and see it yourself. What's nice is you can copy the code in the log.

 

libname tsa '/folders/myfolders/TSA Case Study';

data TSA.IMPORTEDCLAIMS2;
	infile '/folders/myfolders/TSA Case Study/TSAClaims2002_2017.csv' delimiter=',' MISSOVER DSD lrecl=32767 firstobs=2;
	informat    Claim_Number $13.
		         Date_Received best32.
			 Incident_Date best32.
			 Airport_Code $3.
			 Airport_Name $48.
 			 Claim_Type $39.
 			 Claim_Site $15.
	 		 Item_Category $834.
 			 Close_Amount best32.
			 Disposition $23.
			 StateName $17.
 			 State $2.
			 County $20.
 		         City $33.;
	format 	Claim_Number $13.
	 		Date_Received best12.
	 		Incident_Date best12.
	 		Airport_Code $3.
	 		Airport_Name $48.
	 		Claim_Type $39.
	 		Claim_Site $15.
	 		Item_Category $834.
	 		Close_Amount best12.
	 		Disposition $23.
	 		StateName $17.
	 		State $2.
	 		County $20.
	 		City $33.;
	input 
            Claim_Number  $
            Date_Received 
            Incident_Date 
	    Airport_Code  $
            Airport_Name  $
            Claim_Type  $
            Claim_Site  $
            Item_Category  $
            Close_Amount 
            Disposition  $
            StateName  $
            State  $
            County  $
            City  $;
run;

- Peter

 

 

hai

why does the claim_number appear like 2.00608E+12?

is it correct?

 

It's the way they were exported to CSV. Shouldn't be an issue once you import the data to SAS.

 

options validvarname=v7;
proc import datafile="/folders/myfolders/myprograms/TSAClaims2002_2017.csv"
dbms=csv out=tsa.claims_cleaned replace;
guessingrows=max;
run;

/* proc print data=tsa.claims_cleaned (obs=20); */
/* run; */
/* proc contents data=work.tsa_data varnum; */
/* run; */

proc freq data=tsa.claims_cleaned;
tables claim_site claim_type / nocol nocum;
tables Date_received Incident_date / nocol nocum;
format date_received Incident_date date9.;
run;

proc sort data=tsa.claims_cleaned
nodupkey out=tsa.claims_dupout ;
by _all_;
run;

proc sort data=tsa.claims_dupout;
by Incident_Date;
run;

data tsa_data;
set tsa.claims_cleaned;
if claim_type = "-" or claim_type = " " then Claim_type = "Unknown";
if claim_site = "-" or claim_site = " " then Claim_site = "Unknown";
length disposition $ 23;
if disposition = "-" or disposition = " " then disposition = "Unknown";
Statenames = propcase(Statename);
State = upcase(State);
format Date_Received date9.;
format Incident_date date9.;
if (Date_Received = " " or Incident_date = " ") or
(year(Date_Received) < 2002 and year(Incident_date) > 2017) or
(Incident_date > Date_recieved)
then Date_Issues = "Needs Review";
drop country city;
format Close_amount dollar10.2;
format Date_received Incident_date ddmmmyy.;
run;

ods pdf file=claimReports pdftoc=1 style=meadow startpage=no;
proc print data=tsa.claims_cleaned;
run;
ods pdf close;

I had a question how we can display our zone time and date separately in the results or in the log with a macro variable like systime or systimezone or anyother if available.....?

I have used the following code to import file, is there anything wrong with this?

data tsaraw;
infile "&path/tsaclaims2002_2017.csv" dlm="," dsd missover firstobs=2 ;
input Claim_Number :$13.
Date_Received
Incident_Date
Airport_Code :$3.
Airport_Name :$30.
Claim_Type :$35.
Claim_Site :$20.
Item_Category :$600.
Close_Amount :dollar.
Disposition :$25.
StateName :$propcase20.
State :$upcase2.
County :$30.
City :$30.
;
run;

proc sql;
select distinct(incident_date) from tsaraw;
quit;
This returns some date like following:
19700.92
What does this mean?
 
 

 

 

The date which it has returned is because of you haven't mentioned any
format for your column Incident_Date So,the value which is displayed to you
is the no of days from Base date(i.e 01 Jan 1960)

Thanks Anurag12 for your reply. But the source data (in csv file) is in sas format and numeric that is why I did not provide any format. Looks like dates are correct. Also the informat propcase for state did not work, while upcase for state worked.

Where can I find the answers to verify my work.

TsaRaw: 220,855 rows

TsaUnique claim#: 220,828 rows

TsaClean after putting all requirements: 219441 rows

I got 4007 date issues.

 

 

 

 

 

Propcase didn't work beacuse i think that there is no format as such and just try giving a date format to your date column

and why don't you specify a informat statement which will tell sas how to read the data in that specific  column

From what I learnt, Informat is needed if the input data is in non-standard form (like 03/12/2015). I wanted to keep the same sas date  format to verify my import data.

Anyone, please let me know where is the solution.

Thanks.

Finally, I found the solutions.

Here is the data to verify:

Total imported rows: 220855.

Total dup rows: 5.

Total Date issues: 4241.

If you get 4241 rows that means all your requirements are met.

Good luck!

Great..✌️

Hello, 

 

I fairly new to the SAS Environment and when i attempt to set up my case study i receive the following errors: 

ERROR: Error connecting to 72.247.205.215:443. (The connection has timed out.)
ERROR: Unable to establish connection to support.sas.com.
ERROR: Unable to connect to Web server.
NOTE: PROCEDURE HTTP used (Total process time): real time 2:07.38 user cpu time 0.10 seconds system cpu time 0.12 seconds memory 333.34k OS Memory 44704.00k Timestamp 04/04/2023 12:21:00 PM Step Count 63 Switch Count 195 Page Faults 11 Page Reclaims 147 Page Swaps 0 Voluntary Context Switches 6970 Involuntary Context Switches 2 Block Input Operations 1032 Block Output Operations 8 NOTE: The SAS System stopped processing this step because of errors. 67 68 /*  Save TSAClaims2022_2017.csv in the WORK library path */ 69 data _null_; 70 infile data; 71 input; 72 file "&path/TSAClaims2002_2017.csv"; 73 put _infile_ ; 74 run; ERROR: Physical file does not exist, /sas/grid/SAS_work718D0001686A_svcksa69901a7z/#LN00079. NOTE: The file "/sas/grid/SAS_work718D0001686A_svcksa69901a7z/SAS_workA1DA0001686A_svcksa69901a7z/TSAClaims2002_2017.csv" is: Filename=/sas/grid/SAS_work718D0001686A_svcksa69901a7z/SAS_workA1DA0001686A_svcksa69901a7z/TSAClaims2002_2017.csv, Owner Name=malop14,Group Name=ux_R2002430, Access Permission=-rw-rw----, Last Modified=04Apr2023:10:21:00 NOTE: 0 records were written to the file "/sas/grid/SAS_work718D0001686A_svcksa69901a7z/SAS_workA1DA0001686A_svcksa69901a7z/TSAClaims2002_2017.csv". 3

However, i am able to copy the hyperlink and it downloads directly, does anyone know what i can do to set this up versus having to download from csv from the site directly?

Version history
Last update:
‎09-28-2018 10:05 AM
Updated by:

SAS INNOVATE 2024

Innovate_SAS_Blue.png

Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.

If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website. 

Register now!

Get the $99 certification deal.jpg

 

 

Back in the Classroom!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Article Labels
Article Tags