BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
epiflo
Calcite | Level 5

Hello,

I am trying to get some frequency numbers for two patient populations:

 

Pop 1 : Patients who have an initial  behavior code of 0 followed by behavior code 1 based on year of diagnosis (yod).

Pop 2: Patients who have an initial behavior code of 0 followed by behavior code 1 based on diagnosis (yod).

 

Some patients have up to 6 diagnosis but I am only interested in the first 2 as explained above based on the yod.

 

How would you code this given I have multiple diagnosis (behavior_code) and different yod to obtain the n’s for each population?

 

Here is an example of some of the variables I am working with:

 

patient_idageyodbehavior_codedet_sequence_no
155-59 years201101
160-64 years201512
255-59 years200101
255-59 years200212
370-74 years200201
375-79 years200712
445-49 years200301
450-54 years200802
450-54 years200913
460-64 years201814

 

Thank you so much in advance!

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

After making your data usable (always, as in ALWAYS, post your example data as data step code which creates the data), I get this:

data have;
infile datalines dlm="09"x dsd truncover;
input patient_id $ age :$15. yod behavior_code det_sequence_no;
datalines;
1	55-59 years	2011	0	1
1	60-64 years	2015	1	2
2	55-59 years	2001	0	1
2	55-59 years	2002	1	2
3	70-74 years	2002	0	1
3	75-79 years	2007	1	2
4	45-49 years	2003	0	1
4	50-54 years	2008	0	2
4	50-54 years	2009	1	3
4	60-64 years	2018	1	4
;

data want;
set have;
by patient_id;
label flag = "Behavior of 2nd Diagnosis";
if first.patient_id
then count = 1;
else count + 1;
beh = lag(behavior_code);
if count = 2;
if not beh
then if behavior_code
  then flag = 1;
  else flag = 0;
keep patient_id flag;
run;

/* The following step, taken from https://support.sas.com/kb/23/350.html,
   is only there to get the variable label for the variable column */
proc template;                                                                
   edit Base.Freq.OneWayList ;             
      edit h1; 
    /* set text to BLANK instead of VARLABEL */ 
         text " ";                                                            
      end;                                                   
                                                                              
      edit FVariable;                                                       
       just=varjust;
       style=rowheader;
       id;
       generic;
       header=varlabel;
      end;
   end;                                    
run;

proc freq data=want;
tables flag / nocum;
run;

which gets this result:

Bildschirmfoto 2023-01-03 um 10.56.35.png

 

View solution in original post

7 REPLIES 7
epiflo
Calcite | Level 5
Hello Kurt-
Thank you for the response, actually, the data I provided is a small set
from the almost 50K observations I have. Some of the patients have multiple
diagnoses of either a 0 or 1. All patients start with a diagnosis of 0 but
then I guess you can say they recur by either developing a 0 or 1 so I am
trying to figure out how to obtain the count for:
number of patients that initially present with a diagnosis of 0 and then
develop another 0
and the
number of patients that initially present with a diagnosis of 0 and then
develop a 1.

Kurt_Bremser
Super User

So you want to count patients with a 0/1 combination in the first two observations vs. those with a 0/0, and disregard all further observations for each patient?

data want;
set have;
by patient_id;
if first.patient_id
then count = 1;
else count + 1;
beh = lag(behavior_code);
if count = 2;
if not beh
then if behavior_code
  then flag = 1;
  else flag = 0;
keep patient_id flag;
run;

Should a patient start with 1, flag will be missing.

epiflo
Calcite | Level 5

Thank you so much, Kurt and happy new year.  I tried your suggested code but it gave me the below. 

 

epiflo_0-1672702527100.png

 

.

I think my problem is getting the frequency number for the second diagnosis when some patients have 3 or more diagnosis. I am simply interested in the initial 0 and the second using the diagnosis year.. As I mentioned the first diagnosis for all patients is a 0 and the second diagnosis will either will be a 0 or 1.

epiflo_1-1672702527100.png

 

How can I generate the following table? For example this is the ideal output I am looking for with made-up numbers:

 

 

Behavior of 2nd diagnosis

Frequency

Percent

 

 

 

0

211

44.99

1

258

55.01

 

 

 

Thank you again for your help. I hope this clarifies further.. 

Kurt_Bremser
Super User

After making your data usable (always, as in ALWAYS, post your example data as data step code which creates the data), I get this:

data have;
infile datalines dlm="09"x dsd truncover;
input patient_id $ age :$15. yod behavior_code det_sequence_no;
datalines;
1	55-59 years	2011	0	1
1	60-64 years	2015	1	2
2	55-59 years	2001	0	1
2	55-59 years	2002	1	2
3	70-74 years	2002	0	1
3	75-79 years	2007	1	2
4	45-49 years	2003	0	1
4	50-54 years	2008	0	2
4	50-54 years	2009	1	3
4	60-64 years	2018	1	4
;

data want;
set have;
by patient_id;
label flag = "Behavior of 2nd Diagnosis";
if first.patient_id
then count = 1;
else count + 1;
beh = lag(behavior_code);
if count = 2;
if not beh
then if behavior_code
  then flag = 1;
  else flag = 0;
keep patient_id flag;
run;

/* The following step, taken from https://support.sas.com/kb/23/350.html,
   is only there to get the variable label for the variable column */
proc template;                                                                
   edit Base.Freq.OneWayList ;             
      edit h1; 
    /* set text to BLANK instead of VARLABEL */ 
         text " ";                                                            
      end;                                                   
                                                                              
      edit FVariable;                                                       
       just=varjust;
       style=rowheader;
       id;
       generic;
       header=varlabel;
      end;
   end;                                    
run;

proc freq data=want;
tables flag / nocum;
run;

which gets this result:

Bildschirmfoto 2023-01-03 um 10.56.35.png

 

epiflo
Calcite | Level 5
Thank you, Kurt. This worked!
andreas_lds
Jade | Level 19

I don't understand the difference between Pop1 and Pop2. If the code posted by @Kurt_Bremser  does not solve the issue, you need to post example data containing all possible combinations and show the expected result for that dataset.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 998 views
  • 1 like
  • 3 in conversation