Hello,
I am trying to get some frequency numbers for two patient populations:
Pop 1 : Patients who have an initial behavior code of 0 followed by behavior code 1 based on year of diagnosis (yod).
Pop 2: Patients who have an initial behavior code of 0 followed by behavior code 1 based on diagnosis (yod).
Some patients have up to 6 diagnosis but I am only interested in the first 2 as explained above based on the yod.
How would you code this given I have multiple diagnosis (behavior_code) and different yod to obtain the n’s for each population?
Here is an example of some of the variables I am working with:
patient_id | age | yod | behavior_code | det_sequence_no |
1 | 55-59 years | 2011 | 0 | 1 |
1 | 60-64 years | 2015 | 1 | 2 |
2 | 55-59 years | 2001 | 0 | 1 |
2 | 55-59 years | 2002 | 1 | 2 |
3 | 70-74 years | 2002 | 0 | 1 |
3 | 75-79 years | 2007 | 1 | 2 |
4 | 45-49 years | 2003 | 0 | 1 |
4 | 50-54 years | 2008 | 0 | 2 |
4 | 50-54 years | 2009 | 1 | 3 |
4 | 60-64 years | 2018 | 1 | 4 |
Thank you so much in advance!
After making your data usable (always, as in ALWAYS, post your example data as data step code which creates the data), I get this:
data have;
infile datalines dlm="09"x dsd truncover;
input patient_id $ age :$15. yod behavior_code det_sequence_no;
datalines;
1 55-59 years 2011 0 1
1 60-64 years 2015 1 2
2 55-59 years 2001 0 1
2 55-59 years 2002 1 2
3 70-74 years 2002 0 1
3 75-79 years 2007 1 2
4 45-49 years 2003 0 1
4 50-54 years 2008 0 2
4 50-54 years 2009 1 3
4 60-64 years 2018 1 4
;
data want;
set have;
by patient_id;
label flag = "Behavior of 2nd Diagnosis";
if first.patient_id
then count = 1;
else count + 1;
beh = lag(behavior_code);
if count = 2;
if not beh
then if behavior_code
then flag = 1;
else flag = 0;
keep patient_id flag;
run;
/* The following step, taken from https://support.sas.com/kb/23/350.html,
is only there to get the variable label for the variable column */
proc template;
edit Base.Freq.OneWayList ;
edit h1;
/* set text to BLANK instead of VARLABEL */
text " ";
end;
edit FVariable;
just=varjust;
style=rowheader;
id;
generic;
header=varlabel;
end;
end;
run;
proc freq data=want;
tables flag / nocum;
run;
which gets this result:
All your patients have a 0 followed by a 1, so they all fall into the same group.
So you want to count patients with a 0/1 combination in the first two observations vs. those with a 0/0, and disregard all further observations for each patient?
data want;
set have;
by patient_id;
if first.patient_id
then count = 1;
else count + 1;
beh = lag(behavior_code);
if count = 2;
if not beh
then if behavior_code
then flag = 1;
else flag = 0;
keep patient_id flag;
run;
Should a patient start with 1, flag will be missing.
Thank you so much, Kurt and happy new year. I tried your suggested code but it gave me the below.
.
I think my problem is getting the frequency number for the second diagnosis when some patients have 3 or more diagnosis. I am simply interested in the initial 0 and the second using the diagnosis year.. As I mentioned the first diagnosis for all patients is a 0 and the second diagnosis will either will be a 0 or 1.
How can I generate the following table? For example this is the ideal output I am looking for with made-up numbers:
Behavior of 2nd diagnosis | Frequency | Percent |
|
|
|
0 | 211 | 44.99 |
1 | 258 | 55.01 |
Thank you again for your help. I hope this clarifies further..
After making your data usable (always, as in ALWAYS, post your example data as data step code which creates the data), I get this:
data have;
infile datalines dlm="09"x dsd truncover;
input patient_id $ age :$15. yod behavior_code det_sequence_no;
datalines;
1 55-59 years 2011 0 1
1 60-64 years 2015 1 2
2 55-59 years 2001 0 1
2 55-59 years 2002 1 2
3 70-74 years 2002 0 1
3 75-79 years 2007 1 2
4 45-49 years 2003 0 1
4 50-54 years 2008 0 2
4 50-54 years 2009 1 3
4 60-64 years 2018 1 4
;
data want;
set have;
by patient_id;
label flag = "Behavior of 2nd Diagnosis";
if first.patient_id
then count = 1;
else count + 1;
beh = lag(behavior_code);
if count = 2;
if not beh
then if behavior_code
then flag = 1;
else flag = 0;
keep patient_id flag;
run;
/* The following step, taken from https://support.sas.com/kb/23/350.html,
is only there to get the variable label for the variable column */
proc template;
edit Base.Freq.OneWayList ;
edit h1;
/* set text to BLANK instead of VARLABEL */
text " ";
end;
edit FVariable;
just=varjust;
style=rowheader;
id;
generic;
header=varlabel;
end;
end;
run;
proc freq data=want;
tables flag / nocum;
run;
which gets this result:
I don't understand the difference between Pop1 and Pop2. If the code posted by @Kurt_Bremser does not solve the issue, you need to post example data containing all possible combinations and show the expected result for that dataset.
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.