BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
serena13lee
Quartz | Level 8

I have a data set of patients who have been taking medication for certain amounts of time. I am trying to flag the days 'Y' where they take any drug for >=15 consecutive days but also flag the days as 'N' that don't contribute to those >=15 consecutive days irrespective of medication

 

data have;
    input ID $ medication start_date :date. end_date :date.;
    datalines;
A005 1 01SEP2016 01SEP2016
A005 2 02SEP2016 02SEP2016
A005 1 16SEP2016 19SEP2016
A005 3 19SEP2016 30SEP2016 
A006 1 01OCT2016 03OCT2016 A006 1 05OCT2016 15OCT2016 A006 1 11OCT2016 19OCT2016 A006 3 31OCT2016 31OCT2016 ;

Because the dates are on various lines, I dont know how to flag them when they overlap. Is there anyway I can do this? Perhaps I have to create an intermediate variable? 

 

data want;
    input ID $ medication	start_date :date. enddate :date. flag;
    datalines;
A005 1 01SEP2016 01SEP2016 N
A005 2 02SEP2016 02SEP2016 N
A005 1 16SEP2016 19SEP2016 Y
A005 3 19SEP2016 30SEP2016 Y
A006 1 01OCT2016 03OCT2016 N 
A006 1 05OCT2016 15OCT2016 Y
A006 1 11OCT2016 19OCT2016 Y
A006 3 31OCT2016 31OCT2016 N
;
1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Hello @serena13lee 

 

Try this and let me know

 

data have;
    input ID $ medication start_date :date9. end_date :date9.;
	format 	start_date end_date date9.;
    datalines;
A005 1 01SEP2016 01SEP2016
A005 2 02SEP2016 02SEP2016
A005 1 16SEP2016 19SEP2016
A005 3 19SEP2016 30SEP2016
A006 1 01OCT2016 03OCT2016
A006 1 05OCT2016 15OCT2016
A006 1 11OCT2016 19OCT2016
A006 3 31OCT2016 31OCT2016
A007 1 01SEP2016 01SEP2016
A007 2 02SEP2016 03SEP2016
A007 1 03SEP2016 19SEP2016
A007 3 19SEP2016 30SEP2016
;



data temp;
set have;
by id ;
retain e;
if first.id then  f=1;
else if start_date-e>1 then f+1;
e=end_date;
drop e;
run;



proc sql;
create table want(drop=f) as
select *, ifc(max(end_date)-min(start_date)+1>=15,'Y','N') as Flag
from temp
group by id,f
order by id, start_date;
quit;

View solution in original post

11 REPLIES 11
Reeza
Super User
How big are you data sets and how many medications are you dealing with here? One common approach that drastically simplifies these analysis is to create a record for each date and then indicate which drugs are present on each date. But that can make your data set very long very quickly so for larger datasets a more succint approach is desired.
novinosrin
Tourmaline | Level 20

@serena13lee  Keeping it simple

 

data have;
    input ID $ medication start_date :date9. end_date :date9.;
	format 	start_date end_date date9.;
    datalines;
A005 1 01SEP2016 01SEP2016
A005 2 02SEP2016 02SEP2016
A005 1 16SEP2016 19SEP2016
A005 3 19SEP2016 30SEP2016
A006 1 01OCT2016 03OCT2016
A006 1 05OCT2016 15OCT2016
A006 1 11OCT2016 19OCT2016
A006 3 31OCT2016 31OCT2016
;


data temp;
set have;
by id;
if first.id then  f=1;
else if start_date>lag(end_date) then	f+1;
run;


proc sql;
create table want(drop=f) as
select *, ifc(max(end_date)-min(start_date)+1>=15,'Y','N') as Flag
from temp
group by id,f
order by id, start_date;
quit;
serena13lee
Quartz | Level 8

Hi and thank you so much for your solution. I fitted your code to my data but realized that I didn't capture patients like the following. Patients who ended medication 01SEP2016 then restarted 02SEP2016 should also be part of the consecutive count. Is there anyway I can capture this? Perhaps another lag? 

 

data want;
    input ID $ medication start_date :date9. end_date :date9. flag;
	format 	start_date end_date date9.;
    datalines;
A007 1 01SEP2016 01SEP2016 Y
A007 2 02SEP2016 03SEP2016 Y
A007 1 03SEP2016 19SEP2016 Y
A007 3 19SEP2016 30SEP2016 Y
; 
novinosrin
Tourmaline | Level 20

Hello @serena13lee 

 

Try this and let me know

 

data have;
    input ID $ medication start_date :date9. end_date :date9.;
	format 	start_date end_date date9.;
    datalines;
A005 1 01SEP2016 01SEP2016
A005 2 02SEP2016 02SEP2016
A005 1 16SEP2016 19SEP2016
A005 3 19SEP2016 30SEP2016
A006 1 01OCT2016 03OCT2016
A006 1 05OCT2016 15OCT2016
A006 1 11OCT2016 19OCT2016
A006 3 31OCT2016 31OCT2016
A007 1 01SEP2016 01SEP2016
A007 2 02SEP2016 03SEP2016
A007 1 03SEP2016 19SEP2016
A007 3 19SEP2016 30SEP2016
;



data temp;
set have;
by id ;
retain e;
if first.id then  f=1;
else if start_date-e>1 then f+1;
e=end_date;
drop e;
run;



proc sql;
create table want(drop=f) as
select *, ifc(max(end_date)-min(start_date)+1>=15,'Y','N') as Flag
from temp
group by id,f
order by id, start_date;
quit;
serena13lee
Quartz | Level 8

Thanks again with your quick reply. Your second solution was able to identify those consecutive dates. Both solutions were very clean and robust. Thank you for sharing them with me. I was able to run to successfully get the desired dataset. 

serena13lee
Quartz | Level 8

Hi @novinosrin , 

 

Apologies for the follow up question. I just realized that with this code, it returns a note: "The query requires remerging summary statistics back with the original data." I was wondering if there was a way to remove this? After reading this paper https://support.sas.com/resources/papers/proceedings/proceedings/sugi28/103-28.pdf it appears to be either due to the select * or the max(). 

novinosrin
Tourmaline | Level 20

The remerge was done it chosen on purpose. That's basically is the logic to get the "want". I'm just waking up. I'll message you once I get to work with a fix

novinosrin
Tourmaline | Level 20

Hello @serena13lee 

 


data temp;
set have;
by id ;
retain e;
if first.id then  f=1;
else if start_date-e>1 then f+1;
e=end_date;
drop e;
run;

proc sql;
create table want(drop=f) as
select a.*,Flag
from
temp a inner join (select id,f, ifc(max(end_date)-min(start_date)+1>=15,'Y','N') as Flag from temp group by id,f) b
on a.id=b.id and a.f=b.f
order by id, start_date;
quit;
VDD
Ammonite | Level 13 VDD
Ammonite | Level 13

Are you sure that you want this record marked as Y

A005 3 19SEP2016 30SEP2016 Y

novinosrin
Tourmaline | Level 20

Hi @VDD   I think yes for the reason "irrespective of medication"

serena13lee
Quartz | Level 8
Hi! Yes since it contributes to the patient being on any medication >=15 consecutive days. Thanks for clarifying!

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 1644 views
  • 1 like
  • 4 in conversation