I have a data set of patients who have been taking medication for certain amounts of time. I am trying to flag the days 'Y' where they take any drug for >=15 consecutive days but also flag the days as 'N' that don't contribute to those >=15 consecutive days irrespective of medication
data have;
input ID $ medication start_date :date. end_date :date.;
datalines;
A005 1 01SEP2016 01SEP2016
A005 2 02SEP2016 02SEP2016
A005 1 16SEP2016 19SEP2016
A005 3 19SEP2016 30SEP2016
A006 1 01OCT2016 03OCT2016
A006 1 05OCT2016 15OCT2016
A006 1 11OCT2016 19OCT2016
A006 3 31OCT2016 31OCT2016
;
Because the dates are on various lines, I dont know how to flag them when they overlap. Is there anyway I can do this? Perhaps I have to create an intermediate variable?
data want;
input ID $ medication start_date :date. enddate :date. flag;
datalines;
A005 1 01SEP2016 01SEP2016 N
A005 2 02SEP2016 02SEP2016 N
A005 1 16SEP2016 19SEP2016 Y
A005 3 19SEP2016 30SEP2016 Y
A006 1 01OCT2016 03OCT2016 N
A006 1 05OCT2016 15OCT2016 Y
A006 1 11OCT2016 19OCT2016 Y
A006 3 31OCT2016 31OCT2016 N
;
Hello @serena13lee
Try this and let me know
data have;
input ID $ medication start_date :date9. end_date :date9.;
format start_date end_date date9.;
datalines;
A005 1 01SEP2016 01SEP2016
A005 2 02SEP2016 02SEP2016
A005 1 16SEP2016 19SEP2016
A005 3 19SEP2016 30SEP2016
A006 1 01OCT2016 03OCT2016
A006 1 05OCT2016 15OCT2016
A006 1 11OCT2016 19OCT2016
A006 3 31OCT2016 31OCT2016
A007 1 01SEP2016 01SEP2016
A007 2 02SEP2016 03SEP2016
A007 1 03SEP2016 19SEP2016
A007 3 19SEP2016 30SEP2016
;
data temp;
set have;
by id ;
retain e;
if first.id then f=1;
else if start_date-e>1 then f+1;
e=end_date;
drop e;
run;
proc sql;
create table want(drop=f) as
select *, ifc(max(end_date)-min(start_date)+1>=15,'Y','N') as Flag
from temp
group by id,f
order by id, start_date;
quit;
@serena13lee Keeping it simple
data have;
input ID $ medication start_date :date9. end_date :date9.;
format start_date end_date date9.;
datalines;
A005 1 01SEP2016 01SEP2016
A005 2 02SEP2016 02SEP2016
A005 1 16SEP2016 19SEP2016
A005 3 19SEP2016 30SEP2016
A006 1 01OCT2016 03OCT2016
A006 1 05OCT2016 15OCT2016
A006 1 11OCT2016 19OCT2016
A006 3 31OCT2016 31OCT2016
;
data temp;
set have;
by id;
if first.id then f=1;
else if start_date>lag(end_date) then f+1;
run;
proc sql;
create table want(drop=f) as
select *, ifc(max(end_date)-min(start_date)+1>=15,'Y','N') as Flag
from temp
group by id,f
order by id, start_date;
quit;
Hi and thank you so much for your solution. I fitted your code to my data but realized that I didn't capture patients like the following. Patients who ended medication 01SEP2016 then restarted 02SEP2016 should also be part of the consecutive count. Is there anyway I can capture this? Perhaps another lag?
data want;
input ID $ medication start_date :date9. end_date :date9. flag;
format start_date end_date date9.;
datalines;
A007 1 01SEP2016 01SEP2016 Y
A007 2 02SEP2016 03SEP2016 Y
A007 1 03SEP2016 19SEP2016 Y
A007 3 19SEP2016 30SEP2016 Y
;
Hello @serena13lee
Try this and let me know
data have;
input ID $ medication start_date :date9. end_date :date9.;
format start_date end_date date9.;
datalines;
A005 1 01SEP2016 01SEP2016
A005 2 02SEP2016 02SEP2016
A005 1 16SEP2016 19SEP2016
A005 3 19SEP2016 30SEP2016
A006 1 01OCT2016 03OCT2016
A006 1 05OCT2016 15OCT2016
A006 1 11OCT2016 19OCT2016
A006 3 31OCT2016 31OCT2016
A007 1 01SEP2016 01SEP2016
A007 2 02SEP2016 03SEP2016
A007 1 03SEP2016 19SEP2016
A007 3 19SEP2016 30SEP2016
;
data temp;
set have;
by id ;
retain e;
if first.id then f=1;
else if start_date-e>1 then f+1;
e=end_date;
drop e;
run;
proc sql;
create table want(drop=f) as
select *, ifc(max(end_date)-min(start_date)+1>=15,'Y','N') as Flag
from temp
group by id,f
order by id, start_date;
quit;
Thanks again with your quick reply. Your second solution was able to identify those consecutive dates. Both solutions were very clean and robust. Thank you for sharing them with me. I was able to run to successfully get the desired dataset.
Hi @novinosrin ,
Apologies for the follow up question. I just realized that with this code, it returns a note: "The query requires remerging summary statistics back with the original data." I was wondering if there was a way to remove this? After reading this paper https://support.sas.com/resources/papers/proceedings/proceedings/sugi28/103-28.pdf it appears to be either due to the select * or the max().
The remerge was done it chosen on purpose. That's basically is the logic to get the "want". I'm just waking up. I'll message you once I get to work with a fix
Hello @serena13lee
data temp;
set have;
by id ;
retain e;
if first.id then f=1;
else if start_date-e>1 then f+1;
e=end_date;
drop e;
run;
proc sql;
create table want(drop=f) as
select a.*,Flag
from
temp a inner join (select id,f, ifc(max(end_date)-min(start_date)+1>=15,'Y','N') as Flag from temp group by id,f) b
on a.id=b.id and a.f=b.f
order by id, start_date;
quit;
Are you sure that you want this record marked as Y
A005 3 19SEP2016 30SEP2016 Y
Hi @VDD I think yes for the reason "irrespective of medication"
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.