BookmarkSubscribeRSS Feed
monsterpie
Obsidian | Level 7

Hello all,

 

I am trying to create a new variable in my dataset which counts the frequency of unique diagnoses per ID. I want the frequency variable to count any unique diagnosis - except "stress" or "infection". Below is an example of my dataset and desired output. (Note: in reality, my actual dataset has many different types of diagnoses, not just those shown below, but I still do not want to include "stress" or "infection" in my new frequency variable).

Any help would be much appreciated. Thank you.

 

Desired output:

ID visit_num diag age frequency
1 1 diabetes 42 1
2 1 stroke 78 1
2 2 stress    
2 3 stroke    
3 1 infection 61 0
3 2      

 

Data:

data have;
input id visit_num diag$ age;
cards;
01 1 diabetes 42 
02 1 stroke 78 
02 2 stress . 
02 3 stroke . 
03 1 infection 61 
03 2 . . 

;
run;

 

7 REPLIES 7
PaigeMiller
Diamond | Level 26
data have;
input id visit_num diag :$9. age;
cards;
01 1 diabetes 42 
02 1 stroke 78 
02 2 stress . 
02 3 stroke . 
03 1 infection 61 
03 2 . . 
;
run;
proc sort data=have;
    by id diag;
run;
proc freq data=have(where=(diag not in ('stress','infection')));
    tables id*diag/noprint out=_counts_;
run;

data want;
    merge have _counts_(drop=percent rename=(count=frequency));
    by id diag;
run;
proc sort data=want;
    by id visit_num;
run;
--
Paige Miller
tarheel13
Rhodochrosite | Level 12

said not to count stress or infection though. should use a where statement to avoid counting those. 

data have;
input id visit_num diag :$9. age;
cards;
01 1 diabetes 42 
02 1 stroke 78 
02 2 stress . 
02 3 stroke . 
03 1 infection 61 
03 2 . . 
;
run;
proc sort data=have;
    by id diag;
run;
proc freq data=have;
	where diag not in ('stress','infection');
    tables id*diag/noprint out=_counts_;
run;

data want;
    merge have(in=a) _counts_(drop=percent rename=(count=frequency));
    by id diag;
    if a;
run;
proc sort data=want;
    by id visit_num;
run;
PaigeMiller
Diamond | Level 26

@tarheel13 wrote:

said not to count stress or infection though. should use a where statement to avoid counting those. 


Already fixed

--
Paige Miller
monsterpie
Obsidian | Level 7

Thanks @PaigeMiller for your help. I think that almost does what I need it to, but to clarify, I actually want the count of the unique diagnoses per ID, regardless of what the actual diagnosis is, but not counting stress and infection. (E.g., for ID2, the count would be 1, for ID3 the count would be 0).

PaigeMiller
Diamond | Level 26

@monsterpie wrote:

I actually want the count of the unique diagnoses per ID, regardless of what the actual diagnosis is, but not counting stress and infection. (E.g., for ID2, the count would be 1, for ID3 the count would be 0).


This contradicts your stated desired output, where ID2 shows a frequency of 2. So which is it?

--
Paige Miller
monsterpie
Obsidian | Level 7

My apologies @PaigeMiller. I edited the original post to fix this error, I am looking to count the unique diganoses per ID. So for ID2, the count should be 1.

tarheel13
Rhodochrosite | Level 12

are you also not counting strokes? 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1276 views
  • 1 like
  • 3 in conversation