Hi,
I am trying to filter some of my data and create a new dataset. using the following syntax:
data YYY;
set XXX;
if DeathCause_ICD in ("A923", "A9231","A9232","A9239") THEN disease1=1;
IF condition_rec_1 in ("A923", "A9231","A9232","A9239") then disease1=1;
if condition_rec_2 in ("A923", "A9231","A9232","A9239") then disease1=1;
if condition_rec_3 in ("A923", "A9231","A9232","A9239") then disease1=1;
IF condition_rec_4 in ("A923", "A9231","A9232","A9239") then disease1=1;
if condition_rec_5 in ("A923", "A9231","A9232","A9239") then disease1=1;
if condition_rec_6 in ("A923", "A9231","A9232","A9239") then disease1=1;
IF condition_rec_7 in ("A923", "A9231","A9232","A9239") then disease1=1;
if condition_rec_8 in ("A923", "A9231","A9232","A9239") then disease1=1;
if condition_rec_9 in ("A923", "A9231","A9232","A9239") then disease1=1;
IF condition_rec_10 in ("A923", "A9231","A9232","A9239") then disease1=1;
if condition_rec_11 in ("A923", "A9231","A9232","A9239") then disease1=1;
if condition_rec_12 in ("A923", "A9231","A9232","A9239") then disease1=1;
IF condition_rec_13 in ("A923", "A9231","A9232","A9239") then disease1=1;
if condition_rec_14 in ("A923", "A9231","A9232","A9239") then disease1=1;
if condition_rec_15 in ("A923", "A9231","A9232","A9239") then disease1=1;
IF condition_rec_16 in ("A923", "A9231","A9232","A9239") then disease1=1;
if condition_rec_17 in ("A923", "A9231","A9232","A9239") then disease1=1;
if condition_rec_18 in ("A923", "A9231","A9232","A9239") then disease1=1;
IF condition_rec_19 in ("A923", "A9231","A9232","A9239") then disease1=1;
if condition_rec_20 in ("A923", "A9231","A9232","A9239") then disease1=1;
else delete;
RUN;
While I am not receinving any error in the log, still my syntax not working and giving me 0 observations.
I have used this syntax multiple times and i canot figure out wher it is going wront. any thoughts appreciated.
Also, you can save yourself some typing and avoid possible typographic errors like this:
%let codes="A923", "A9231","A9232","A9239";
data YYY;
set XXX;
if DeathCause_ICD in (&codes)
or condition_rec_1 in (&codes)
or condition_rec_2 in (&codes)
....
or condition_rec_19 in (&codes)
or condition_rec_20 in (&codes) then disease1=1;
else delete;
RUN;
IF condition_rec_19 in ("A923", "A9231","A9232","A9239") then disease1=1;
if condition_rec_20 in ("A923", "A9231","A9232","A9239") then disease1=1;
else delete;
Above is the portion of your code where the problem lies. On the second line above, if this fails because condition_rec_20 is not one of the four values shown, then the next line executes and deletes the record. Everything that is tested previously does not have an impact on this deletion. What I think you want is (I've left out a few lines):
data YYY;
set XXX;
if DeathCause_ICD in ("A923", "A9231","A9232","A9239") THEN disease1=1;
else IF condition_rec_1 in ("A923", "A9231","A9232","A9239") then disease1=1;
else if condition_rec_2 in ("A923", "A9231","A9232","A9239") then disease1=1;
....
else IF condition_rec_19 in ("A923", "A9231","A9232","A9239") then disease1=1;
else if condition_rec_20 in ("A923", "A9231","A9232","A9239") then disease1=1;
else delete;
RUN;
Also, you can save yourself some typing and avoid possible typographic errors like this:
%let codes="A923", "A9231","A9232","A9239";
data YYY;
set XXX;
if DeathCause_ICD in (&codes)
or condition_rec_1 in (&codes)
or condition_rec_2 in (&codes)
....
or condition_rec_19 in (&codes)
or condition_rec_20 in (&codes) then disease1=1;
else delete;
RUN;
yes, the esle if statement did work. thank you so much.
And an alternate approach (untested as no data available)
data yyy; set xxx /* array to allow some shorthand code*/ array rec (*) Deathcause_ICD condition_rec_: ; disease1 = ( whichc('A923' ,of rec(*) )>0 or whichc('A9231',of rec(*) )>0 or whichc('A9232',of rec(*) )>0 or whichc('A9239',of rec(*) )>0 ); if disease1; run;
The WHICHC function searches for the value of the first parameter in a list of values or variables and returns the position in list if found. The Array statement is way to place all the variables to search in a shorthand version. The "of rec(*)" says use all variables in the array for the list to search. There is a companion function for numeric values WHICHN. This sort of list can be used with most functions that accept a list of variables.
The IF keeps observations where Disease1 is 1.
IF there are no codes that start A923 other than the 4 shown you could use the =: 'A923' to compare the variable to see if it begins with those and don't even need the list
data YYY; set XXX; if DeathCause_ICD =: 'A923' or condition_rec_1 =: 'A923' or condition_rec_2 =: 'A923' <etc>
If you have other codes such as A9236 that you do want to delete then this would not be appropriate.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.