I have a data set of patients who have been taking medication for certain amounts of time. I am trying to flag the days 'Y' where they take any drug for >=15 consecutive days but also flag the days as 'N' that don't contribute to those >=15 consecutive days irrespective of medication
data have;
input ID $ medication start_date :date. end_date :date.;
datalines;
A005 1 01SEP2016 01SEP2016
A005 2 02SEP2016 02SEP2016
A005 1 16SEP2016 19SEP2016
A005 3 19SEP2016 30SEP2016
A006 1 01OCT2016 03OCT2016
A006 1 05OCT2016 15OCT2016
A006 1 11OCT2016 19OCT2016
A006 3 31OCT2016 31OCT2016
;
Because the dates are on various lines, I dont know how to flag them when they overlap. Is there anyway I can do this? Perhaps I have to create an intermediate variable?
data want;
input ID $ medication start_date :date. enddate :date. flag;
datalines;
A005 1 01SEP2016 01SEP2016 N
A005 2 02SEP2016 02SEP2016 N
A005 1 16SEP2016 19SEP2016 Y
A005 3 19SEP2016 30SEP2016 Y
A006 1 01OCT2016 03OCT2016 N
A006 1 05OCT2016 15OCT2016 Y
A006 1 11OCT2016 19OCT2016 Y
A006 3 31OCT2016 31OCT2016 N
;
Hello @serena13lee
Try this and let me know
data have;
input ID $ medication start_date :date9. end_date :date9.;
format start_date end_date date9.;
datalines;
A005 1 01SEP2016 01SEP2016
A005 2 02SEP2016 02SEP2016
A005 1 16SEP2016 19SEP2016
A005 3 19SEP2016 30SEP2016
A006 1 01OCT2016 03OCT2016
A006 1 05OCT2016 15OCT2016
A006 1 11OCT2016 19OCT2016
A006 3 31OCT2016 31OCT2016
A007 1 01SEP2016 01SEP2016
A007 2 02SEP2016 03SEP2016
A007 1 03SEP2016 19SEP2016
A007 3 19SEP2016 30SEP2016
;
data temp;
set have;
by id ;
retain e;
if first.id then f=1;
else if start_date-e>1 then f+1;
e=end_date;
drop e;
run;
proc sql;
create table want(drop=f) as
select *, ifc(max(end_date)-min(start_date)+1>=15,'Y','N') as Flag
from temp
group by id,f
order by id, start_date;
quit;
@serena13lee Keeping it simple
data have;
input ID $ medication start_date :date9. end_date :date9.;
format start_date end_date date9.;
datalines;
A005 1 01SEP2016 01SEP2016
A005 2 02SEP2016 02SEP2016
A005 1 16SEP2016 19SEP2016
A005 3 19SEP2016 30SEP2016
A006 1 01OCT2016 03OCT2016
A006 1 05OCT2016 15OCT2016
A006 1 11OCT2016 19OCT2016
A006 3 31OCT2016 31OCT2016
;
data temp;
set have;
by id;
if first.id then f=1;
else if start_date>lag(end_date) then f+1;
run;
proc sql;
create table want(drop=f) as
select *, ifc(max(end_date)-min(start_date)+1>=15,'Y','N') as Flag
from temp
group by id,f
order by id, start_date;
quit;
Hi and thank you so much for your solution. I fitted your code to my data but realized that I didn't capture patients like the following. Patients who ended medication 01SEP2016 then restarted 02SEP2016 should also be part of the consecutive count. Is there anyway I can capture this? Perhaps another lag?
data want;
input ID $ medication start_date :date9. end_date :date9. flag;
format start_date end_date date9.;
datalines;
A007 1 01SEP2016 01SEP2016 Y
A007 2 02SEP2016 03SEP2016 Y
A007 1 03SEP2016 19SEP2016 Y
A007 3 19SEP2016 30SEP2016 Y
;
Hello @serena13lee
Try this and let me know
data have;
input ID $ medication start_date :date9. end_date :date9.;
format start_date end_date date9.;
datalines;
A005 1 01SEP2016 01SEP2016
A005 2 02SEP2016 02SEP2016
A005 1 16SEP2016 19SEP2016
A005 3 19SEP2016 30SEP2016
A006 1 01OCT2016 03OCT2016
A006 1 05OCT2016 15OCT2016
A006 1 11OCT2016 19OCT2016
A006 3 31OCT2016 31OCT2016
A007 1 01SEP2016 01SEP2016
A007 2 02SEP2016 03SEP2016
A007 1 03SEP2016 19SEP2016
A007 3 19SEP2016 30SEP2016
;
data temp;
set have;
by id ;
retain e;
if first.id then f=1;
else if start_date-e>1 then f+1;
e=end_date;
drop e;
run;
proc sql;
create table want(drop=f) as
select *, ifc(max(end_date)-min(start_date)+1>=15,'Y','N') as Flag
from temp
group by id,f
order by id, start_date;
quit;
Thanks again with your quick reply. Your second solution was able to identify those consecutive dates. Both solutions were very clean and robust. Thank you for sharing them with me. I was able to run to successfully get the desired dataset.
Hi @novinosrin ,
Apologies for the follow up question. I just realized that with this code, it returns a note: "The query requires remerging summary statistics back with the original data." I was wondering if there was a way to remove this? After reading this paper https://support.sas.com/resources/papers/proceedings/proceedings/sugi28/103-28.pdf it appears to be either due to the select * or the max().
The remerge was done it chosen on purpose. That's basically is the logic to get the "want". I'm just waking up. I'll message you once I get to work with a fix
Hello @serena13lee
data temp;
set have;
by id ;
retain e;
if first.id then f=1;
else if start_date-e>1 then f+1;
e=end_date;
drop e;
run;
proc sql;
create table want(drop=f) as
select a.*,Flag
from
temp a inner join (select id,f, ifc(max(end_date)-min(start_date)+1>=15,'Y','N') as Flag from temp group by id,f) b
on a.id=b.id and a.f=b.f
order by id, start_date;
quit;
Are you sure that you want this record marked as Y
A005 3 19SEP2016 30SEP2016 Y
Hi @VDD I think yes for the reason "irrespective of medication"
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.