Hello SAS Community!
I am hoping someone might have a trick to shorten the following IF-THEN statements. I am working in a file that contains 44 diagnosis code variables. I need to go through those and turn some of the diagnosis codes into new indicator variables for the specific diagnosis (for example, femoral neck fracture = yes 1/no 0). I am not interested in all of the diagnosis codes (there are over 5k), so I don't want to transpose the variables.
What I have been doing is as follows:
data x; set x;
if diagcode1 = 'S72.0' then femoralneckfx = 1;
if diagcode2 = 'S72.0' then femoralneckfx = 1;
....(and so on for all 44 diagnosis codes)....
if femoralneckfx NE 1 then femoralneckfx = 0;
run;
I then repeat a similar process for each of the 9 diagnoses of interest.
I found I could shorten it a little by typing "if diagcode1 = 'S72.0' or diagcode2 = 'S72.0' or diagcode3 = 'S72.0' .......then femoralneckfx = 1;"
But I was wondering if there is a way to write it even shorter. Can you look for the text across all 44 variables at the same time, so you don't have to repeat the ='S72.0'?
I tried " if 'S72.0' in (diagcode1, diagcode2, diagcode3...) then femoralneckfx = 1; " but that didn't work either.
Thanks for any help you can provide!
Steph
Hello @sjarvis847,
@sjarvis847 wrote:
I tried " if 'S72.0' in (diagcode1, diagcode2, diagcode3...) then femoralneckfx = 1; " but that didn't work either.
The correct syntax for this type of condition is: value IN arrayname.
data want;
set have;
array diagcode[44];
femoralneckfx=('S72.0' in diagcode);
run;
The necessary ARRAY statement could also refer to the variable list diagcode:, assuming that diagcode1 through diagcode44 are the only variables in the input dataset (HAVE) whose names start with "diagcode". Then the array name doesn't need to be diagcode and also the hardcoded "44" can be eliminated:
data want;
set have;
array _dc[*] diagcode:;
femoralneckfx=('S72.0' in _dc);
run;
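You could also use the WHICHC function, which returns the position of the first argument that matches the search value, or 0 if there is no match. Example given at the link.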
data x1;
set x;
femoralneckfx=whichc('S72.0',of diagcode:)>0;
run;
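Since the question mentions repeating this for 9 diagnoses of interest, here is a minimal sketch of how the array approach might extend to several flags in one DATA step. Only 'S72.0' and femoralneckfx come from the thread. The second code 'S72.1' and the variable name pertrochfx are hypothetical placeholders, and the sketch assumes diagcode1 through diagcode44 are the only variables in HAVE whose names start with "diagcode".

```sas
data want;
  set have;
  /* All variables whose names start with "diagcode" (assumed: diagcode1-diagcode44) */
  array _dc[*] diagcode:;
  /* Each comparison evaluates to 1 if the code appears in any array element, else 0 */
  femoralneckfx = ('S72.0' in _dc);
  /* Hypothetical second diagnosis: replace the code and name with your own */
  pertrochfx    = ('S72.1' in _dc);
run;
```

Each additional diagnosis would then need only one more assignment line, rather than another block of 44 IF-THEN statements.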