Hello all!
I have a longitudinal dataset with repeated measures (healthcare encounter dates) for individuals. The individuals all have different "start dates" (ie, first healthcare encounter date), and they can have differing number of total encounters. I need to a criteria that individuals in my study have at least one healthcare encounter every 2 years between their start (first healthcare encounter date) and end date (last healthcare encounter date). I was thinking about doing this by creating arrays that are 2 years long based on each person's start date, and then "flagging" if an individual has at least one encounter within each array. Any advice/resources on how to go about this? Thank you in advance!
data have;
input id date;
cards;
1 19SEP2005
1 09NOV2007
1 11JAN2008
1 14FEB2009
1 15JUN2011
2 13DEC2011
2 15JUN2012
2 17NOV2012
2 17SEP2013
2 20JAN2015
3 18JAN2003
3 22MAR2005
;
I have tested the code with the sample data without any errors.
If you are trying to run the code with your actual data, you would need to make some modifications.
The number of array elements will be determined dynamically by the code below.
array dt[*] date1-date&max_ct.; array time[*] time1-time&max_ct.;
See if this works, post the log if you get errrors.
proc sql noprint;
select strip(put(count(date),4.)) into :max_ct
from have
group by id;
quit;
%put *&max_ct.*;
data want;
array dt[*] date1-date&max_ct.;
array time[*] time1-time&max_ct.;
format date: first_enc_date date9.;
do i=1 by 1 until(last.id);
set have;
by id date;
dt[i]=date;
if first.id then do;
first_enc_date=date;
end;
end;
do i=1 to i;
set have;
do j = 1 to dim(dt)-1;
if dt[j] > 0 then time[j]=(dt[j+1] <= intnx('year',dt[j],2,'same'));
end;
output;
end;
drop i j date1-date&max_ct.;
run;
1. Show us the desired output.
2. "The individuals all have different "start dates" (ie, first healthcare encounter date)," - Is the start date same as the minimum date for the id ?
3. "I need to a criteria that individuals in my study have at least one healthcare encounter every 2 years between their start (first healthcare encounter date) and end date (last healthcare encounter date)" - Should the next encounter be exactly at 2 years ? Does an encounter between 1 and 2 years qualify ?
Based on my understanding :
data have;
input id date : date9.;
format date date9.;
cards;
1 19SEP2005
1 09NOV2007
1 11JAN2008
1 14FEB2009
1 15JUN2011
2 13DEC2011
2 15JUN2012
2 17NOV2012
2 17SEP2013
2 20JAN2015
3 18JAN2003
3 22MAR2005
;
data want;
set have;
retain first_enc_date flag;
format first_enc_date date9.;
by id date;
if first.id then do;
flag=.;
first_enc_date=date;
end;
else do;
flag=(intck('year', first_enc_date, date) <=2);
end;
run;
2. Yes, the start date is the minimum for date for the id
3. An encounter does not have to be exactly at 2 years. It just needs to be within the 2 year time frame
1. Sorry, I should have included the desired output. In addition to the "flag" and "first_enc_date" columns from your code, I would like time1-timei columns that indicate if, during the 2 year period, the individual had at least one encounter. So for id=1 below, the time-based variables would be created as noted below. 0 means the individual does not have at least one encounter during the timeframe. 1 means the individual has at least one encounter.
id | date | flag | first_enc_date | time1 | time2 | time3 | time4 | time5 | time6 |
1 | 19SEP2005 | . | 19SEP2005 | 1 | 1 | 0 | 0 | 1 | 1 |
1 | 09NOV2007 | . | 19SEP2005 | 1 | 1 | 0 | 0 | 1 | 1 |
1 | 11JAN2008 | 1 | 19SEP2005 | 1 | 1 | 0 | 0 | 1 | 1 |
1 | 14FEB2009 | 1 | 19SEP2005 | 1 | 1 | 0 | 0 | 1 | 1 |
1 | 15JUN2015 | . | 19SEP2005 | 1 | 1 | 0 | 0 | 1 | 1 |
1 | 15JUN2016 | 1 | 19SEP2005 | 1 | 1 | 0 | 0 | 1 | 1 |
2 | 13DEC2011 | 1 | 13DEC2011 | 1 | 1 | ||||
2 | 15JUN2012 | 1 | 13DEC2011 | 1 | 1 | ||||
2 | 17NOV2012 | 1 | 13DEC2011 | 1 | 1 | ||||
2 | 17SEP2013 | 1 | 13DEC2011 | 1 | 1 | ||||
2 | 20JAN2015 | 1 | 13DEC2011 | 1 | 1 | ||||
3 | 18JAN2003 | . | 18JAN2003 | 1 | 1 | ||||
3 | 22MAR2005 | . | 18JAN2003 | 1 | 1 |
Please note that I modified the original data for studyid=1. Here's the code for the output that I listed above:
data have;
input id date : date9.;
format date date9.;
cards;
1 19SEP2005
1 09NOV2007
1 11JAN2008
1 14FEB2009
1 15JUN2015
1 15JUN2016
2 13DEC2011
2 15JUN2012
2 17NOV2012
2 17SEP2013
2 20JAN2015
3 18JAN2003
3 22MAR2005
;
Thank you for your time and help @r_behata !
Not sure if the sample output that you have provided is representative of the desired output.
for instance , ID 1 ; 09NOV2007 - 19SEP2005 > 2 Years but your sample flags this as 1.
For what its worth, I took a stab at it :
data have;
input id date : date9.;
format date date9.;
cards;
1 19SEP2005
1 09NOV2007
1 11JAN2008
1 14FEB2009
1 15JUN2015
1 15JUN2016
2 13DEC2011
2 15JUN2012
2 17NOV2012
2 17SEP2013
2 20JAN2015
3 18JAN2003
3 22MAR2005
;
run;
data want;
array dt[*] date1-date6;
array time[*] time1-time6;
format date: first_enc_date date9.;
do i=1 by 1 until(last.id);
set have;
by id date;
dt[i]=date;
if first.id then do;
first_enc_date=date;
end;
end;
do i=1 to i;
set have;
do j = 1 to dim(dt)-1;
time[j]=(dt[j+1] <= intnx('year',dt[j],2,'same'));
end;
output;
end;
drop i j date1-date6;
run;
Thank you! Yes, you're correct, that was an error on my part. I'm getting an error message, "Array subscript out of range" for the "dt[i]=date" portion of the code. Any suggestions on how I can troubleshoot this?
I have tested the code with the sample data without any errors.
If you are trying to run the code with your actual data, you would need to make some modifications.
The number of array elements will be determined dynamically by the code below.
array dt[*] date1-date&max_ct.; array time[*] time1-time&max_ct.;
See if this works, post the log if you get errrors.
proc sql noprint;
select strip(put(count(date),4.)) into :max_ct
from have
group by id;
quit;
%put *&max_ct.*;
data want;
array dt[*] date1-date&max_ct.;
array time[*] time1-time&max_ct.;
format date: first_enc_date date9.;
do i=1 by 1 until(last.id);
set have;
by id date;
dt[i]=date;
if first.id then do;
first_enc_date=date;
end;
end;
do i=1 to i;
set have;
do j = 1 to dim(dt)-1;
if dt[j] > 0 then time[j]=(dt[j+1] <= intnx('year',dt[j],2,'same'));
end;
output;
end;
drop i j date1-date&max_ct.;
run;
Thank you for your response! Yes, I think this is working now.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.