I have the following code:
Data SSN1;
set ssn;
if lastfour_ssn = 0 then SSN = 1;
if lastfour_ssn = 9999 then ssn= 2;
if lastfour_ssn >9999 or lastfour_ssn<1000 then ssn=3;
if lastfour_ssn= . then ssn= 4;
else ssn=5;
run;
My data has a bunch of three-digit numbers and 0 values, but in the new dataset, ssn only has values of 4 or 5.
What did I do wrong?
Thanks.
Since LASTFOUR_SSN is either missing or it isn't, any value assigned to SSN by the first three IF statements is overwritten. So you essentially ran this code:
data SSN1;
set ssn;
if lastfour_ssn= . then ssn= 4;
else ssn=5;
run;
Add more ELSE keywords so that only one of the five assignment statements will run for any given observation.
Follow this pattern:
if ... then ...;
else if ... then ... ;
else if ... then ...;
else ...;
Without seeing your actual values, the likely cause is that you are reassigning values.
Try this:
Data SSN1;
set ssn;
if lastfour_ssn = 0 then SSN = 1;
else if lastfour_ssn = 9999 then ssn= 2;
else if lastfour_ssn >9999 or (0< lastfour_ssn<1000) then ssn=3;
else if lastfour_ssn= . then ssn= 4;
else ssn=5;
run;
In your code, when execution reached the comparison for missing, every value that was not missing was assigned 5.
If code 1, 2, or 3 has already been assigned, you do not want to check whether the value is missing.
Edited the SSN=3 condition. A missing value is less than any number, so without the lower bound (0 < lastfour_ssn) you would never get the 4 result.
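To illustrate the point about missing values, here is a minimal sketch. The dataset name demo, the flag variables, and the four test values are made up for illustration and are not from the original data. It compares the unbounded condition with the bounded one:

```sas
/* Hypothetical test data: demo, no_bound and with_bound are illustrative names */
data demo;
  input lastfour_ssn;
  /* A numeric missing value (.) compares lower than any number,   */
  /* so the unbounded test is true for missing and for 0 as well   */
  if lastfour_ssn < 1000 then no_bound = 1;
  else no_bound = 0;
  /* The lower bound excludes both missing and 0 */
  if 0 < lastfour_ssn < 1000 then with_bound = 1;
  else with_bound = 0;
  datalines;
.
0
123
1234
;
run;

proc print data=demo;
run;
```

With these values, the missing row and the 0 row should show no_bound=1 but with_bound=0, which is why the lower bound lets the later "= ." branch be reached in the ELSE IF chain.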
Hello @GingerJJ
You have a contradiction in your logic: the condition lastfour_ssn<1000 assigns ssn=3, but zero is also less than 1000.
The appropriate approach would be to use the suggestion from @ballardw, with appropriate modifications to your logic.
Needless to say, the last four digits of an SSN will always be either four characters (0000-9999) or a numeric value between 0 and 9999, so there is hardly any scope for values greater than 9999.
Thank you! It solved the problem.