PhanS
Obsidian | Level 7

Dear Everyone,

I would like to discuss three items with you.

1) 'count' vs. 'array'. I would appreciate your comments on this topic.

In this example from an HIV cohort, I wanted to know how many comorbidities subjects developed after their initial treatment.

I compared the results from 'count' and 'array' and received the same results from both. A colleague of mine thought that 'count' may not be a proper method for this analysis. Indeed, I prefer the array approach because it is simple.

Please see my code and SAS output below:

**count**

if dxyear ge arv_startyr then HTN_howmany=count(dxcode, "HTN");
if dxyear ge arv_startyr then DYSLIPID_howmany=count(dxcode, "DYSLIPID");
if dxyear ge arv_startyr then KIDNEY_UN_howmany=count(dxcode, "KIDNEY_UN");

if HTN_howmany=1 or DYSLIPID_howmany=1 or KIDNEY_UN_howmany=1 then comorbid_sta=1;
else comorbid_sta=0;

format comorbid_sta dx_staf.;

**array**

array array_comorbid[3] $20 ('HTN', 'DYSLIPID', 'KIDNEY_UN');

comorbid_number=0;

do I=1 to dim(array_comorbid);
  if dxyear ge arv_startyr and dxcode=array_comorbid(I) then comorbid_number=comorbid_number+1;
end;

format comorbid_number dx_staf.;

  

The FREQ Procedure

                                           Cumulative    Cumulative
comorbid_sta       Frequency    Percent     Frequency       Percent
-------------------------------------------------------------------
dx_prior_art            6929      99.91          6929         99.91
dx_post_art                6       0.09          6935        100.00

                                           Cumulative    Cumulative
comorbid_number    Frequency    Percent     Frequency       Percent
-------------------------------------------------------------------
dx_prior_art            6929      99.91          6929         99.91
dx_post_art                6       0.09          6935        100.00

2) In the 'count' method, when I use 'and' instead of 'or', I do not get any results. I tried this because I wondered whether a subject may have more than one comorbidity.

if HTN_howmany=1 and DYSLIPID_howmany=1 and KIDNEY_UN_howmany=1 then comorbid_sta=1;
else comorbid_sta=0;

3) Also, I wonder if someone can share a simple method to replace 'count' or 'array', so that I can count more than 20 AIDS-defining illnesses that are recorded under different names in string variables, such as 'CANDIDA', 'CMV', 'CRYPTOCO', 'CRYPTSP', ...

Thank you all in advance for your time and for sharing your SAS knowledge and skills.

Phan S.

Accepted Solutions
art297
Opal | Level 21

Just a couple of comments.

Given your small number of subjects/records it won't make much difference, but in both cases you repeat the same dxyear comparison several times when a single statement would do. For example, with your count code, instead of:

if dxyear ge arv_startyr then HTN_howmany=count(dxcode, "HTN");
if dxyear ge arv_startyr then DYSLIPID_howmany=count(dxcode, "DYSLIPID");
if dxyear ge arv_startyr then KIDNEY_UN_howmany=count(dxcode, "KIDNEY_UN");

you could have used:

if dxyear ge arv_startyr then do;
  HTN_howmany=count(dxcode, "HTN");
  DYSLIPID_howmany=count(dxcode, "DYSLIPID");
  KIDNEY_UN_howmany=count(dxcode, "KIDNEY_UN");
end;

However, judging from your array code, you are really only interested in checking whether the entry is one of those values, not in the individual counts, so the whole thing could be reduced to something like:

data want;
  set have;
  comorbid_sta=ifn(dxyear ge arv_startyr and dxcode in ('HTN', 'DYSLIPID', 'KIDNEY_UN'),1,0);
run;
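
For the third question (more than 20 AIDS-defining illnesses), the same IN test can probably be reused by lengthening the list. The sketch below is an illustration under assumptions, not part of the accepted answer: the macro variable adi_codes and the variable adi_sta are hypothetical names, and only the four codes quoted in the question are listed. Note that IN tests for an exact, case-sensitive match on dxcode, whereas COUNT looks for a substring, so the two can differ if dxcode holds more than a single code.

```sas
/* Sketch only. adi_codes and adi_sta are hypothetical names.      */
/* Only the four codes quoted in the question are listed here;     */
/* the remaining AIDS-defining illness codes would be appended.    */
%let adi_codes = 'CANDIDA', 'CMV', 'CRYPTOCO', 'CRYPTSP';

data want;
  set have;
  /* A logical expression evaluates to 1 (true) or 0 (false):      */
  /* 1 when the diagnosis year is on or after the ART start year   */
  /* and dxcode exactly matches one of the listed codes.           */
  adi_sta = (dxyear ge arv_startyr and dxcode in (&adi_codes));
run;
```

Keeping the list in one macro variable means it only has to be edited in one place if it is reused in several steps.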

PhanS
Obsidian | Level 7

Dear Sir,

Thanks for your useful comments. Your code is perfect! I would have saved time and energy if I had used it from the beginning.

At the same time, since you raised the point that 'you're not interested in the individual values', I wonder if you can recommend syntax that would let me obtain those values.

Sincerely,

Phan S

art297
Opal | Level 21

Not sure what syntax you are referring to.  I was commenting on the fact that you created three how-many fields in the non-array code that you didn't create in the array code.

Do you actually want those fields calculated?

PhanS
Obsidian | Level 7

Dear Sir,

That is correct. I want to calculate separately the number of subjects who had HTN, DYSLIPID, and KIDNEY_UN, as well as the total number of comorbidities in the cohort, before and after the initial treatment.

Thanks,

Phan S.

art297
Opal | Level 21

Why not just take care of the separate counts in proc freq?  e.g.,

data want;
  set have;
  comorbid_sta=ifn(dxyear ge arv_startyr and dxcode in ('HTN', 'DYSLIPID', 'KIDNEY_UN'),1,0);
run;

proc freq data=want;
  tables comorbid_sta;
  tables comorbid_sta*dxcode;
run;
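
If the data hold one diagnosis per row, the counts above are counts of records rather than subjects, and a row-level test that joins all three conditions with 'and' can never be true, because a single dxcode cannot equal three different values. A per-subject summary is one possible way around both issues. The following is only a sketch under assumptions: subject_id is a hypothetical identifier that does not appear in the thread, and the output names (per_subject, htn_post, and so on) are made up for illustration.

```sas
/* Sketch only. subject_id is a hypothetical subject identifier;   */
/* per_subject and the *_post names are illustrative.              */
proc sql;
  create table per_subject as
  select subject_id,
         /* Each logical expression is 1 or 0 per row; MAX turns   */
         /* it into "ever true for this subject".                  */
         max(dxyear ge arv_startyr and dxcode = 'HTN')       as htn_post,
         max(dxyear ge arv_startyr and dxcode = 'DYSLIPID')  as dyslipid_post,
         max(dxyear ge arv_startyr and dxcode = 'KIDNEY_UN') as kidney_un_post
  from have
  group by subject_id;
quit;

data per_subject;
  set per_subject;
  /* Number of distinct post-treatment comorbidities per subject   */
  n_comorbid_post = sum(htn_post, dyslipid_post, kidney_un_post);
run;

proc freq data=per_subject;
  tables htn_post dyslipid_post kidney_un_post n_comorbid_post;
run;
```

Subjects with more than one comorbidity are then those with n_comorbid_post ge 2. Flags for diagnoses before treatment could be built the same way by replacing ge with lt.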

PhanS
Obsidian | Level 7

Dear Sir,

Yes, I did use these procedures. Thanks for the clarification.

Before closing, I would like to thank you again for sharing your thoughts and for your support.

Sincerely,

Phan S.
