Hi, I am trying to follow the instructions in the SAS beginner programming course on the TSA Claims Data. I am filtering through the data as they proposed, but whenever I try to condition on the dates being missing or '.' to make them be 'needs review', it replaces all of the 220k rows with it, instead of just the missing ones:
This is what they have:
This is what I have:
data work.tsa_report;
set work.tsa_nodupkey;
retain Date_Issues;
/* 3. Clean the Claim_Site column. */
if Claim_Site in('-' '') then Claim_Site='Unknown';
/* 4. Clean the Disposition column. */
if Disposition in('-' '') then Disposition='Unknown';
else if Disposition in('Closed: Contractor Claim' 'losed: Contractor Claim') then
Disposition='Closed:Contractor Claim';
else if Disposition='Closed: Canceled' then Disposition='Closed:Canceled';
/* 5. Clean the Claim_Type column. */
if Claim_Type in('-' '') then Claim_Type='Unknown';
else if Claim_Type in('Passenger Property Loss/Personal Injur'
'Passenger Property Loss/Personal Injury') then
Claim_Type='Passenger Property Loss';
else if Claim_Type='Property Damage/Personal Injury' then
Claim_Type='Passenger Damage';
/* 6. Convert all State values to uppercase and all StateName values to proper case. */
StateName=propcase(StateName);
State=upcase(State);
/* 7. Create a new column to indicate date issues. */
if (Incident_Date > Date_Received or
Date_Received = . or
Incident_Date = . or
year(Incident_Date) < 2002 or
year(Incident_Date) > 2017 or
year(Date_Received) < 2002 or
year(Date_Received) > 2017) then Date_Issues='Needs Review';
/* 8. Add permanent labels and formats. */
format Date_Received Incident_Date date9. Close_Amount dollar20.2;
label Claim_Number='Claim Number'
Incident_Date='Incident Date'
Date_Received='Date Received'
Airport_Code='Airport Code'
Airport_Name='Airport Name'
Claim_Type='Claim Type'
Claim_Site='Claim Site'
Item_Category='Item Category'
Close_Amount='Close Amount'
Date_Issues='Date Issues';
/* 9. Drop County and City. */
drop County City;
run;
I don't know why it's doing this as I had this problem before.
How are the variables defined? Make sure those are the actual variable names. Make sure they are actually numeric variables that contain date values. Not character variables. Not datetime values.
Try looking at the current values.
proc freq data=tsa_nodupkey ;
tables Date_Received*Incident_Date / list;
format Date_Received Incident_Date year4.;
run;
Make sure you actually have some cases that fall into the years 2002 to 2017.
Note that if the variables contain DATETIME values then the YEAR() function will return a missing value, which in comparison operations will be considered smaller than any actual number.
1 data _null_; 2 x=datetime(); 3 if year(x) < 2017 then put 'Less than 2017 ' x= datetime19. x=dtdate9. x=dtyear4.; 4 run; NOTE: Invalid argument to function YEAR(2071614560.5) at line 3 column 6. Less than 2017 x=24AUG2025:00:29:21 x=24AUG2025 x=2025 x=2071614560.5 _ERROR_=1 _N_=1 NOTE: Mathematical operations could not be performed at the following places. The results of the operations have been set to missing values. Each place is given by: (Number of times) at (Line):(Column). 1 at 3:6 NOTE: DATA statement used (Total process time): real time 0.01 seconds cpu time 0.01 seconds
Try recoding the IF condition to be simpler. You seem to want this condition.
if not (Incident_Date <= Date_Received
and year(Incident_Date) in (2002:2017)
and year(Date_Received) in (2002:2017)
) then Date_Issues='Needs Review';
It's often helpful to run PROC FREQ to check the values of your variables.
If you run:
prof freq data=work.tsa_nodupkey;
tables Incident_Date*Date_Received/missing list;
format Incident_Date Date_Received year.;
run;
are the results informative?
thank you! it turns out I had accidentally added a 'retain' command which preserved values for every row
Calling all data scientists and open-source enthusiasts! Want to solve real problems that impact your company or the world? Register to hack by August 31st!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.