I have the following code:
Data SSN1;
set ssn;
if lastfour_ssn = 0 then SSN = 1;
if lastfour_ssn = 9999 then ssn= 2;
if lastfour_ssn >9999 or lastfour_ssn<1000 then ssn=3;
if lastfour_ssn= . then ssn= 4;
else ssn=5;
run;
My data has a bunch of three-digit numbers and 0 values, but in the new dataset ssn only takes the values 4 or 5.
What did I do wrong?
Thanks.
Since LASTFOUR_SSN is either missing or it isn't, any value assigned to SSN by the first three IF statements is overwritten by the last IF/ELSE pair. So you essentially ran this code:
data SSN1;
set ssn;
if lastfour_ssn= . then ssn= 4;
else ssn=5;
run;
Add more ELSE keywords so that only one of the five assignment statements will run for any given observation.
Follow this pattern:
if ... then ...;
else if ... then ... ;
else if ... then ...;
else ...;
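To see the difference on a small scale, here is a minimal sketch that runs both versions side by side. The dataset names demo and compare, the variables ssn_indep and ssn_chain, and the test values are all made up for illustration; only the conditions come from the original post.

```sas
/* Hypothetical test data: values chosen for illustration only */
data demo;
  input lastfour_ssn;
  datalines;
0
123
9999
12345
.
4567
;
run;

data compare;
  set demo;

  /* Original logic: independent IFs, so the final IF/ELSE always wins */
  if lastfour_ssn = 0 then ssn_indep = 1;
  if lastfour_ssn = 9999 then ssn_indep = 2;
  if lastfour_ssn > 9999 or lastfour_ssn < 1000 then ssn_indep = 3;
  if lastfour_ssn = . then ssn_indep = 4;
  else ssn_indep = 5;

  /* Chained logic: the first true condition wins, the rest are skipped */
  if lastfour_ssn = 0 then ssn_chain = 1;
  else if lastfour_ssn = 9999 then ssn_chain = 2;
  else if lastfour_ssn > 9999 or lastfour_ssn < 1000 then ssn_chain = 3;
  else if lastfour_ssn = . then ssn_chain = 4;
  else ssn_chain = 5;
run;

proc print data=compare;
run;
```

With these rows, ssn_indep should come out as 4 for the missing row and 5 everywhere else, while ssn_chain separates the cases. Note that the missing row still lands in 3 under the chained version, because a missing value compares as less than 1000, so the order and bounds of the conditions matter too.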
Without seeing any actual values, the likely cause is that you are reassigning values.
Try this:
Data SSN1;
set ssn;
if lastfour_ssn = 0 then SSN = 1;
else if lastfour_ssn = 9999 then ssn= 2;
else if lastfour_ssn >9999 or (0< lastfour_ssn<1000) then ssn=3;
else if lastfour_ssn= . then ssn= 4;
else ssn=5;
run;
When your code got to the comparison for missing and the value was not missing, it went on to assign everything to 5.
If the 1, 2, or 3 codes are assigned, you do not want to check whether the value is missing.
Edit: I changed the SSN=3 condition. MISSING is less than any number, so without the lower bound a missing value would satisfy lastfour_ssn<1000 and you would never get the 4 result.
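Another way to keep missing values out of the range check is to test for them first. This is only a sketch of that alternative, not the code posted above; it uses the MISSING function in place of the comparison with a period.

```sas
data SSN1;
  set ssn;
  /* Handle missing first so it can never match the "< 1000" test below */
  if missing(lastfour_ssn) then ssn = 4;
  else if lastfour_ssn = 0 then ssn = 1;
  else if lastfour_ssn = 9999 then ssn = 2;
  else if lastfour_ssn > 9999 or lastfour_ssn < 1000 then ssn = 3;
  else ssn = 5;
run;
```

One difference to be aware of: this version puts negative values in 3, while the 0< lastfour_ssn<1000 version sends them to 5. MISSING() also counts special missing values such as .A, which a comparison with a plain period does not.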
Hello @GingerJJ
You have a contradiction in your logic: if lastfour_ssn<1000 then ssn=3, yet zero is also less than 1000.
The appropriate approach would be to use the suggestion from @ballardw, with appropriate modifications to your logic.
Needless to say, the last four digits of an SSN will always be four characters (0000-9999) or numeric values between 0 and 9999, so there is hardly any scope for values greater than 9999.
Thank you! It solved the problem.