BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
epiflo
Calcite | Level 5

Hello,

I am trying to get some frequency numbers for two patient populations:

 

Pop 1 : Patients who have an initial  behavior code of 0 followed by behavior code 1 based on year of diagnosis (yod).

Pop 2: Patients who have an initial behavior code of 0 followed by behavior code 1 based on diagnosis (yod).

 

Some patients have up to 6 diagnosis but I am only interested in the first 2 as explained above based on the yod.

 

How would you code this given I have multiple diagnosis (behavior_code) and different yod to obtain the n’s for each population?

 

Here is an example of some of the variables I am working with:

 

patient_idageyodbehavior_codedet_sequence_no
155-59 years201101
160-64 years201512
255-59 years200101
255-59 years200212
370-74 years200201
375-79 years200712
445-49 years200301
450-54 years200802
450-54 years200913
460-64 years201814

 

Thank you so much in advance!

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

After making your data usable (always, as in ALWAYS, post your example data as data step code which creates the data), I get this:

data have;
infile datalines dlm="09"x dsd truncover;
input patient_id $ age :$15. yod behavior_code det_sequence_no;
datalines;
1	55-59 years	2011	0	1
1	60-64 years	2015	1	2
2	55-59 years	2001	0	1
2	55-59 years	2002	1	2
3	70-74 years	2002	0	1
3	75-79 years	2007	1	2
4	45-49 years	2003	0	1
4	50-54 years	2008	0	2
4	50-54 years	2009	1	3
4	60-64 years	2018	1	4
;

data want;
set have;
by patient_id;
label flag = "Behavior of 2nd Diagnosis";
if first.patient_id
then count = 1;
else count + 1;
beh = lag(behavior_code);
if count = 2;
if not beh
then if behavior_code
  then flag = 1;
  else flag = 0;
keep patient_id flag;
run;

/* The following step, taken from https://support.sas.com/kb/23/350.html,
   is only there to get the variable label for the variable column */
proc template;                                                                
   edit Base.Freq.OneWayList ;             
      edit h1; 
    /* set text to BLANK instead of VARLABEL */ 
         text " ";                                                            
      end;                                                   
                                                                              
      edit FVariable;                                                       
       just=varjust;
       style=rowheader;
       id;
       generic;
       header=varlabel;
      end;
   end;                                    
run;

proc freq data=want;
tables flag / nocum;
run;

which gets this result:

Bildschirmfoto 2023-01-03 um 10.56.35.png

 

View solution in original post

7 REPLIES 7
epiflo
Calcite | Level 5
Hello Kurt-
Thank you for the response, actually, the data I provided is a small set
from the almost 50K observations I have. Some of the patients have multiple
diagnoses of either a 0 or 1. All patients start with a diagnosis of 0 but
then I guess you can say they recur by either developing a 0 or 1 so I am
trying to figure out how to obtain the count for:
number of patients that initially present with a diagnosis of 0 and then
develop another 0
and the
number of patients that initially present with a diagnosis of 0 and then
develop a 1.

Kurt_Bremser
Super User

So you want to count patients with a 0/1 combination in the first two observations vs. those with a 0/0, and disregard all further observations for each patient?

data want;
set have;
by patient_id;
if first.patient_id
then count = 1;
else count + 1;
beh = lag(behavior_code);
if count = 2;
if not beh
then if behavior_code
  then flag = 1;
  else flag = 0;
keep patient_id flag;
run;

Should a patient start with 1, flag will be missing.

epiflo
Calcite | Level 5

Thank you so much, Kurt and happy new year.  I tried your suggested code but it gave me the below. 

 

epiflo_0-1672702527100.png

 

.

I think my problem is getting the frequency number for the second diagnosis when some patients have 3 or more diagnosis. I am simply interested in the initial 0 and the second using the diagnosis year.. As I mentioned the first diagnosis for all patients is a 0 and the second diagnosis will either will be a 0 or 1.

epiflo_1-1672702527100.png

 

How can I generate the following table? For example this is the ideal output I am looking for with made-up numbers:

 

 

Behavior of 2nd diagnosis

Frequency

Percent

 

 

 

0

211

44.99

1

258

55.01

 

 

 

Thank you again for your help. I hope this clarifies further.. 

Kurt_Bremser
Super User

After making your data usable (always, as in ALWAYS, post your example data as data step code which creates the data), I get this:

data have;
infile datalines dlm="09"x dsd truncover;
input patient_id $ age :$15. yod behavior_code det_sequence_no;
datalines;
1	55-59 years	2011	0	1
1	60-64 years	2015	1	2
2	55-59 years	2001	0	1
2	55-59 years	2002	1	2
3	70-74 years	2002	0	1
3	75-79 years	2007	1	2
4	45-49 years	2003	0	1
4	50-54 years	2008	0	2
4	50-54 years	2009	1	3
4	60-64 years	2018	1	4
;

data want;
set have;
by patient_id;
label flag = "Behavior of 2nd Diagnosis";
if first.patient_id
then count = 1;
else count + 1;
beh = lag(behavior_code);
if count = 2;
if not beh
then if behavior_code
  then flag = 1;
  else flag = 0;
keep patient_id flag;
run;

/* The following step, taken from https://support.sas.com/kb/23/350.html,
   is only there to get the variable label for the variable column */
proc template;                                                                
   edit Base.Freq.OneWayList ;             
      edit h1; 
    /* set text to BLANK instead of VARLABEL */ 
         text " ";                                                            
      end;                                                   
                                                                              
      edit FVariable;                                                       
       just=varjust;
       style=rowheader;
       id;
       generic;
       header=varlabel;
      end;
   end;                                    
run;

proc freq data=want;
tables flag / nocum;
run;

which gets this result:

Bildschirmfoto 2023-01-03 um 10.56.35.png

 

epiflo
Calcite | Level 5
Thank you, Kurt. This worked!
andreas_lds
Jade | Level 19

I don't understand the difference between Pop1 and Pop2. If the code posted by @Kurt_Bremser  does not solve the issue, you need to post example data containing all possible combinations and show the expected result for that dataset.

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1666 views
  • 1 like
  • 3 in conversation