I am trying to find the first date of the latest regimen in a group of treatments. I'm having a hard time wrapping my head around this. I have a feeling PROC SQL with a HAVING statement might be my best bet, but I'm not able to think of how to do this.
Suppose I have several subjects who have received various regimens of therapy where each regimen can contain multiple therapies. I want to find the latest date for each subject. Then, I need to find the earliest date in that corresponding regimen.
So, in the data below, Subject 101-001 has a latest date of 04JUL2014, which is Regimen #2. The earliest date in Regimen #2 is 10JUN2014. That would be the record I want to keep. Likewise, Subject 101-002 has a latest date of 21DEC2016, which is in Regimen #2. The earliest date in Regimen #2 is 15NOV2016. That would be the record I would want to keep.
I can't make any assumptions like regimens being in chronological order. It could very well happen that Regimen #2 happens a couple years before Regimen #1. Regimens are the order they are entered more so than when they occurred.
Subject Regimen Date
101-001 | 1 | 12-Dec-2013 |
101-001 | 1 | 21-Dec-2013 |
101-001 | 1 | 5-Jan-2014 |
101-001 | 2 | 10-Jun-2014 |
101-001 | 2 | 19-Jun-2014 |
101-001 | 2 | 4-Jul-2014 |
101-002 | 1 | 27-Sep-2016 |
101-002 | 1 | 1-Oct-2016 |
101-002 | 2 | 15-Nov-2016 |
101-002 | 2 | 21-Dec-2016 |
101-002 | 3 | 20-Nov-2016 |
101-002 | 3 | 1-Dec-2016 |
data have;
input Subject $ Regimen Date :date9.;
format date date9.;
cards;
101-001 1 12-Dec-2013
101-001 1 21-Dec-2013
101-001 1 5-Jan-2014
101-001 2 10-Jun-2014
101-001 2 19-Jun-2014
101-001 2 4-Jul-2014
101-002 1 27-Sep-2016
101-002 1 1-Oct-2016
101-002 2 15-Nov-2016
101-002 2 21-Dec-2016
101-002 3 20-Nov-2016
101-002 3 1-Dec-2016
;
proc sql;
create table want as
select a.*
from have a, (select subject,regimen as _regimen from have group by subject having date=max(date)) b
where a.subject=b.subject
group by a.subject,regimen,_regimen
having date=min(date) and _regimen=regimen;
quit;
data have;
input Subject $ Regimen Date :date9.;
format date date9.;
cards;
101-001 1 12-Dec-2013
101-001 1 21-Dec-2013
101-001 1 5-Jan-2014
101-001 2 10-Jun-2014
101-001 2 19-Jun-2014
101-001 2 4-Jul-2014
101-002 1 27-Sep-2016
101-002 1 1-Oct-2016
101-002 2 15-Nov-2016
101-002 2 21-Dec-2016
101-002 3 20-Nov-2016
101-002 3 1-Dec-2016
;
proc sql;
create table want as
select a.*
from have a, (select subject,regimen as _regimen from have group by subject having date=max(date)) b
where a.subject=b.subject
group by a.subject,regimen,_regimen
having date=min(date) and _regimen=regimen;
quit;
If you would rather not use SQL:
proc sort data=have; by subject regimen date; run;
proc means data=have noprint;
by subject;
output out=temp(drop=_: ) maxid(date(regimen))= / autoname;
run;
data want;
merge have temp(in=ok); by subject regimen;
if ok and first.regimen then output;
run;
If you only want the date (and not other variables from the first observation in the regimen), it can be done in a single datastep like this:
data want;
do until(last.subject);
set have;
by subject regimen;
if first.regimen then
first_date=date;
end;
date=first_date;
drop first_date;
run;
data have;
input Subject $ Regimen Date :date9.;
format date date9.;
cards;
101-001 1 12-Dec-2013
101-001 1 21-Dec-2013
101-001 1 5-Jan-2014
101-001 2 10-Jun-2014
101-001 2 19-Jun-2014
101-001 2 4-Jul-2014
101-002 1 27-Sep-2016
101-002 1 1-Oct-2016
101-002 2 15-Nov-2016
101-002 2 21-Dec-2016
101-002 3 20-Nov-2016
101-002 3 1-Dec-2016
;
data temp;
set have;
by Subject Regimen notsorted;
group+first.Regimen;
run;
proc sort data=temp;
by Subject group descending date;
run;
data want;
set temp;
by Subject;
if last.Subject;
run;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.