I'm working with ICD-10/ICD-10M codes, with about 22 different variables to pull the codes from.
I want to create a 2-level indicator variable for each of the main ICD groups, so that if a record has any of the diagnosis codes, it falls into the respective bucket. For example:
'A00' - 'B99' = 'Intestinal Infectious Disease'
But what I'm doing is only returning DX_intest = 0.
ARRAY old [22] h_Diag1 - h_Diag18 h_Admit_Diag_Cd h_Patient_Reason_Visit1 - h_Patient_Reason_Visit3 ;
Do i = 1 to 22;
If substr(OLD(i),1,3) in ("A:") OR substr(OLD(i),1,3) in ("B:") then DX_intest=1;
Else DX_all=0;
end;
If you are trying to find "begins with", don't use
in ("A:")
That compares the 3 characters selected by SUBSTR with the exact 2-character string "A:", colon included. So nothing matches.
Consider:
data example;
   input code $;
   if code in: ('A' 'B') then put 'Found ' code=;
datalines;
A00456
B12456
C00000
;
However, your current program would return only the value of the LAST variable compared: h_Patient_Reason_Visit3
So you might want:
ARRAY old [22] h_Diag1 - h_Diag18 h_Admit_Diag_Cd h_Patient_Reason_Visit1 - h_Patient_Reason_Visit3 ;
dx_all=0;
Do i = 1 to 22;
   dx_intest= substr(OLD(i),1,3) in: ("A" "B");
   if DX_intest=1 then do;
      dx_all = .;
      leave;
   end;
end;
You don't say what dx_all is supposed to represent, so I am guessing that you don't want dx_all to be 0 when you find one of the codes.
The LEAVE instruction exits a do loop (caution with nested loops as to which it leaves). So when you find the first A or B this sets the value of DX_intest to 1 then ends the loop and stops searching. A side effect of this code is that the variable i will have the index of the variable in the array where the value was found.
SAS will return 1 for True comparisons and 0 for false. So if none of the variables have the condition true the result at completion of the array tested will be 0.
To detect ANY, set the result to FALSE before the loop. Then loop over the array, storing the result of the test and stopping when it is TRUE.
dx_intest=0;
do i = 1 to dim(old) while (not dx_intest);
dx_intest= ( old[i] in: ('A' 'B') );
end;
Or even easier just stop once it is TRUE.
do i = 1 to dim(old) until(dx_intest);
dx_intest= ( old[i] in: ('A' 'B') );
end;
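For anyone who wants to try the loop on its own, here is a minimal sketch of a complete step. The dataset names have and want, the three-variable array and the sample codes are all made up for illustration; the real data would use the 22 variables from the question.

```sas
/* Hypothetical test data: three diagnosis variables instead of 22 */
data have;
   length h_Diag1-h_Diag3 $7;
   input h_Diag1 $ h_Diag2 $ h_Diag3 $;
datalines;
C000 A009 E119
E119 I10 J449
K219 Z000 B200
;

data want;
   set have;
   array old [*] h_Diag1-h_Diag3;
   dx_intest = 0;                        /* assume no match until one is found */
   do i = 1 to dim(old) until (dx_intest);
      /* IN: compares only as many characters as the shorter value has */
      dx_intest = ( old[i] in: ('A' 'B') );
   end;
   drop i;
run;
```

With this sample data the first and third records should end up with dx_intest=1 and the second with dx_intest=0.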
Move the zero assignment out of the loop:
DX_intest = 0;
do i = 1 to 22;
if substr(OLD{i},1,1) eq "A" or substr(OLD{i},1,1) eq "B" then DX_intest = 1;
end;
Note that you cannot use wildcards to build a list for the IN operator. All values in the IN list must be explicitly coded. If necessary, you can store all wanted ICD codes in a dataset and use it to build code dynamically.
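One possible way to build the list from a dataset is to read it into a macro variable with PROC SQL. This is only a sketch: the lookup table icd_prefixes, its variable prefix, the macro variable prefix_list and the datasets have and want are hypothetical names, and it relies on the IN: prefix comparison shown earlier in the thread rather than on a list of full codes.

```sas
/* Hypothetical lookup table holding the wanted code prefixes */
data icd_prefixes;
   input prefix $;
datalines;
A
B
;

/* Build a quoted, space-separated list such as "A" "B" */
proc sql noprint;
   select quote(strip(prefix)) into :prefix_list separated by ' '
   from icd_prefixes;
quit;

/* Hypothetical input data with two diagnosis variables */
data have;
   input h_Diag1 $ h_Diag2 $;
datalines;
C000 A009
E119 I10
;

data want;
   set have;
   array old [*] h_Diag1-h_Diag2;
   dx_intest = 0;
   do i = 1 to dim(old) until (dx_intest);
      dx_intest = ( old[i] in: (&prefix_list) );   /* list comes from the lookup table */
   end;
   drop i;
run;
```

Changing the bucket definition then only means editing the lookup table, not the data step.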