I am looking to pull out observations with a specific ICD9 code "493" from either the variable PrincipalDiagnosis or any of SecondaryDiagnosis2-SecondaryDiagnosis20. I have this array for the secondary diagnoses:
data want;
set have;
if principaldiagnosis=:493;
array secondarydiagnosis(*) secondarydiagnosis2-secondarydiagnosis20;
do i= 1 to dim(secondarydiagnosis);
if secondarydiagnosis(i)=: '493' then do;
output;
end;
end;
run;
However, how can I also get this to check the variable PrincipalDiagnosis...
if principaldiagnosis=:'493'
in the same data step?
Thanks for your help!!
@jenim514 wrote:
or like this?
data want;
set have;
array diagnosis(*) principaldiagnosis secondarydiagnosis2-secondarydiagnosis20;
do i= 1 to dim(diagnosis);
if diagnosis(i)=: '493' then do;
output;
end;
end;
run;
Yes, you can just add the variable to the list. Arrays are shortcuts, nothing more, nothing less.
Why separate it? Put the primary in your diagnosis list for secondary and include it in the loop.
You can also access it independently if needed by name.
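A minimal sketch of that point, assuming a small made-up dataset (the names demo and check, the flag variables, and the data values are hypothetical, for illustration only): the first array element and principaldiagnosis are the same variable, so you can test it through the array or by name.
data demo;
  input principaldiagnosis $ secondarydiagnosis2 $ secondarydiagnosis3 $;
  datalines;
49390 4019 25000
4019 49392 2724
4019 25000 2724
;
run;
data check;
  set demo;
  array diagnosis(*) principaldiagnosis secondarydiagnosis2-secondarydiagnosis3;
  * diagnosis(1) is just another name for principaldiagnosis;
  same_value = (diagnosis(1) = principaldiagnosis);
  * the variable can still be tested directly by name;
  principal_is_asthma = (principaldiagnosis =: '493');
run;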
I wish I understood arrays better... I'm not sure how or where to include principaldiagnosis in the loop, since it is a different variable.
data asthma_only;
set champs_j.table_C_MDR;
array primarydiagnosis secondarydiagnosis(*) secondarydiagnosis2-secondarydiagnosis20;
do i= 1 to dim(secondarydiagnosis);
if secondarydiagnosis(i)=: '493' then do;
if principaldiagnosis=:'493' then do;
output;
end;
end;
end;
run;
=== ERROR!!
or like this?
data want;
set have;
array diagnosis(*) principaldiagnosis secondarydiagnosis2-secondarydiagnosis20;
do i= 1 to dim(diagnosis);
if diagnosis(i)=: '493' then do;
output;
end;
end;
run;
Also consider this: if multiple diagnoses begin with "493", you will output the same observation multiple times. To prevent that, you could add another statement immediately following the output:
output;
delete;
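In context, those two statements would sit inside the DO block, roughly as below. This is only a sketch that reuses the dataset and variable names from the earlier posts (have, want, principaldiagnosis) and adds a DROP statement for the loop counter. DELETE stops processing the current observation, so the loop cannot output it a second time.
data want;
  set have;
  array diagnosis(*) principaldiagnosis secondarydiagnosis2-secondarydiagnosis20;
  do i = 1 to dim(diagnosis);
    if diagnosis(i) =: '493' then do;
      output;  * write the observation once;
      delete;  * stop processing this observation and move to the next one;
    end;
  end;
  drop i;
run;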
An array is nothing special. It's just a group of variables that you want to perform similar operations on.
It is also possible to test the whole array in one go, as in the example below.
Of the two DATA WANT steps, the first runs twice as fast as the second.
data HAVE;
retain PRIMARYDIAGNOSIS '495' SECONDARYDIAGNOSIS8 '493';
do I =1 to 1e7; output; end;
drop I;
run;
data WANT;
array DIAGNOSIS(*) $3 PRIMARYDIAGNOSIS SECONDARYDIAGNOSIS2-SECONDARYDIAGNOSIS20;
DSID=open('HAVE');
call set (DSID);
do until(RC);
RC=fetch(DSID);
if RC=0 and whichc('493', of DIAGNOSIS[*]) then output;
end;
RC=close(DSID);
run;
data WANT;
array DIAGNOSIS(*) $ PRIMARYDIAGNOSIS SECONDARYDIAGNOSIS2-SECONDARYDIAGNOSIS20;
set HAVE;
do I=1 to dim(DIAGNOSIS);
if DIAGNOSIS[I] =:'493' then do;
output;
leave;
end;
end;
run;
More performance tips in