Solved: Re: How do i make exception when comparing dates and flag

anandrc · Posted 04-07-2025 06:18 AM

I would appreciate your advice.

I have a dataset where I need a new flag variable for sequential or combination therapy.

Patient	Treatment	Start	End		Flag

E1	A	10-Apr-17	26-Jun-17	Seq	A
E1	B	07-Jun-18	08-Aug-18	Seq	A

E2	B	06-Sep-16	20-Oct-16	Seq	B
E2	A	15-Nov-17	04-Oct-18	Seq	B

E3	A	07-Dec-10	08-Feb-11	Seq	A
E3	A	06-Sep-16	20-Oct-16	Seq	A
E3	B	15-Nov-17	04-Oct-18	Seq	A

E4	B	07-Dec-10	08-Feb-11	Seq	B
E4	B	06-Sep-16	20-Oct-16	Seq	B
E4	A	15-Nov-17	04-Oct-18	Seq	B

E5	A	27-Feb-18	20-Nov-18	Combi	C
E5	B	22-May-18	30-Oct-18	Combi	C

E7	A	01-Feb-16	28-Apr-16	Seq	A
E7	A	20-Apr-17	16-May-17	Seq	A
E7	B	21-Aug-17	02-Jan-19	Seq	A
E7	A	27-May-19	29-Jul-19	Seq	A

E8	B	01-Feb-16	28-Apr-16	Seq	B
E8	B	20-Apr-17	16-May-17	Seq	B
E8	A	21-Aug-17	02-Jan-19	Seq	B
E8	B	27-May-19	29-Jul-19	Seq	B

Treatment can be sequential (the previous treatment ended before the next treatment started)
or combination (the previous treatment had not ended when the next treatment started).

If the patient has an end date for one prior therapy that occurs on or before the start date of another prior therapy,
then assign A or B depending on which starts first
Ex: scenarios 1,2,3 and 4

If the patient doesn’t have an end date for a prior therapy that occurs on or before the start date of another prior therapy,
then assign C.
Ex: scenario 5

Also, For scenarios 3 and 4,
we take the min (start) and max (end) per treatment for comparing.
For Ex: Patient E3, A trt has min (start) as 07-Dec-10 and max (end) as 20-Oct-16 which has ended before min(start) of B (15-Nov-17)

However, for patients E7 and E8 we need to make a few exceptions.

When we take the min(start) and max(end) per treatment, they get flagged as combination although they are sequential.
For example, for patient E7, treatment A has min(start) 01-Feb-16 and max(end) 29-Jul-19.
So when the code compares treatment A's max(end) 29-Jul-19 with treatment B's min(start) 21-Aug-17, it concludes that A had not ended before B started and flags the patient as combination.

Similar for Patient E8.

How to tell the program to make an exception and not count this as combination?
It should be treated as sequential

Similar exceptions should be made for
A B B A, A B B A A etc

Kurt_Bremser · Posted 04-07-2025 04:07 PM

data have;
infile datalines dlm='09'x dsd truncover;
input
  patient $
  treatment $
  start :date9.
  end :date9.
;
format start end date9.;
datalines;
E1	A	10-Apr-17	26-Jun-17	Seq	A
E1	B	07-Jun-18	08-Aug-18	Seq	A
E2	B	06-Sep-16	20-Oct-16	Seq	B
E2	A	15-Nov-17	04-Oct-18	Seq	B
E3	A	07-Dec-10	08-Feb-11	Seq	A
E3	A	06-Sep-16	20-Oct-16	Seq	A
E3	B	15-Nov-17	04-Oct-18	Seq	A
E4	B	07-Dec-10	08-Feb-11	Seq	B
E4	B	06-Sep-16	20-Oct-16	Seq	B
E4	A	15-Nov-17	04-Oct-18	Seq	B
E5	A	27-Feb-18	20-Nov-18	Combi	C
E5	B	22-May-18	30-Oct-18	Combi	C
E7	A	01-Feb-16	28-Apr-16	Seq	A
E7	A	20-Apr-17	16-May-17	Seq	A
E7	B	21-Aug-17	02-Jan-19	Seq	A
E7	A	27-May-19	29-Jul-19	Seq	A
E8	B	01-Feb-16	28-Apr-16	Seq	B
E8	B	20-Apr-17	16-May-17	Seq	B
E8	A	21-Aug-17	02-Jan-19	Seq	B
E8	B	27-May-19	29-Jul-19	Seq	B
;

data want;
if 0 then set have;
flag1 = "Seq  ";
do until (last.patient);
  set have;
  by patient;
  if first.patient then flag2 = treatment;
  if
    not first.patient
    and treatment ne lag(treatment)
    and start lt lag(end)
  then do;
    flag1 = "Combi";
    flag2 = "C";
  end;
end;
do until (last.patient);
  set have;
  by patient;
  output;
end;
run;

Gives the same result that you show in your post.

Please post example data as a DATA step with DATALINES in the future, like I do here.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

quickbluefish · Posted 04-08-2025 06:14 AM

If your actual data are more complicated than this, e.g., many kinds of treatments or other time-varying exposures, or if you need to assess length of overlap, gaps, adherence, etc., I would recommend you convert this into a 'counting process' format wherein each row represents a period of time during which the exposure profile of a patient is static. I use a macro for this, but there are various ways out there to do this. Having data in this format will make it very simple to answer your questions about combination vs. sequential therapy. It may be overkill if you really just have two drugs and simply want to know whether there was ever any overlap, of course. Here's an example, using @Kurt_Bremser's input dataset followed by conversion into an input dataset for the macro. Note that the startdate/enddate variables that are being created in this case are just the earliest start and latest end for each patient, but that's not required - they should instead be the start / end of follow-up for the person if that information is available.

proc datasets lib=work memtype=data nolist nodetails kill; run; quit;

data have;
infile datalines dlm='09'x dsd truncover;
input
  patient $
  treatment $
  start :date9.
  end :date9.
;
format start end date9.;
datalines;
E1	A	10-Apr-17	26-Jun-17	Seq	A
E1	B	07-Jun-18	08-Aug-18	Seq	A
E2	B	06-Sep-16	20-Oct-16	Seq	B
E2	A	15-Nov-17	04-Oct-18	Seq	B
E3	A	07-Dec-10	08-Feb-11	Seq	A
E3	A	06-Sep-16	20-Oct-16	Seq	A
E3	B	15-Nov-17	04-Oct-18	Seq	A
E4	B	07-Dec-10	08-Feb-11	Seq	B
E4	B	06-Sep-16	20-Oct-16	Seq	B
E4	A	15-Nov-17	04-Oct-18	Seq	B
E5	A	27-Feb-18	20-Nov-18	Combi	C
E5	B	22-May-18	30-Oct-18	Combi	C
E7	A	01-Feb-16	28-Apr-16	Seq	A
E7	A	20-Apr-17	16-May-17	Seq	A
E7	B	21-Aug-17	02-Jan-19	Seq	A
E7	A	27-May-19	29-Jul-19	Seq	A
E8	B	01-Feb-16	28-Apr-16	Seq	B
E8	B	20-Apr-17	16-May-17	Seq	B
E8	A	21-Aug-17	02-Jan-19	Seq	B
E8	B	27-May-19	29-Jul-19	Seq	B
;
run;

proc sql;
create table forCP as
select a.patient, a.startdate, a.enddate, 
b.treatment as event, b.start as edate length=4 format=date9.,
b.end-b.start as days length=4
from
	(select patient, min(start) as startdate length=4 format=date9.,
	max(end) as enddate length=4 format=date9. from have group by patient) A
	left join
	have B
	on a.patient=b.patient
order by a.patient, edate, event;
quit;

%include "/path/to/macro/cp.sas";

%cp(
	forCP,
	ptid=patient
	);
	
title 'first 50 obs of output data';
proc print data=cp (obs=50) width=min; run;
title;

In the PROC PRINT output, combination therapy corresponds simply to the rows where both A and B are 1. The length of each window is given by LEN, and winstart/winend are the bounds of that window.
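To illustrate how such a table answers the combination question, here is a minimal sketch. The dataset cp_mock is a hand-built stand-in, not actual output from the macro: the variable names (patient, winstart, winend, len, A, B) follow the description above, but the exact variable set and window boundaries that cp.sas produces are assumptions here.

```sas
/* Hand-made stand-in for counting-process output (hypothetical rows). */
/* One row per window in which a patient's exposure profile is static. */
data cp_mock;
  input patient $ winstart :date9. winend :date9. A B;
  len = winend - winstart + 1;   /* window length in days, bounds inclusive */
  format winstart winend date9.;
  datalines;
E1 10Apr2017 26Jun2017 1 0
E1 07Jun2018 08Aug2018 0 1
E5 27Feb2018 21May2018 1 0
E5 22May2018 30Oct2018 1 1
E5 31Oct2018 20Nov2018 1 0
;

/* A patient is on combination therapy if any window has both drugs active. */
proc sql;
  create table combi_patients as
  select distinct patient
  from cp_mock
  where A = 1 and B = 1;
quit;

proc print data=combi_patients; run;
```

With these mock rows only E5 has a window where both indicators are 1. A minimum overlap could be required by adding a condition on len to the WHERE clause.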

anandrc · Posted 04-14-2025 02:15 AM

Appreciate the response.

In this instance I have only 2 treatments, but it looks like I have to introduce a 30-day rule. Could you please tell me how to access the cp.sas program, which creates winstart, winend, etc.?

How do I ignore the first line of treatment when the gap between the end of the first treatment and the start of the second treatment is more than 30 days, and consider only the second A treatment for flagging purposes?

An example is patient E10, listed below:

Patient	Treatment	Start	End
E10	A	13-Jul-15	21-Aug-15
E10	A	27-Apr-21	27-Apr-21
E10	B	27-Apr-21	27-Sep-21

Also, I need a rule that takes precedence over everything else: flag combination first when the start dates match.
For example, for E9:

Patient	Treatment	Start	End
E9	A	01-Jan-21	01-Jan-21
E9	B	01-Jan-21	01-Jan-21

Thanks

Kurt_Bremser · Posted 04-14-2025 04:16 AM

Quote from myself:

Please post example data as a DATA step with DATALINES in the future

anandrc · Posted 04-14-2025 05:09 AM

Apologies.

data have;
infile datalines truncover; /* the values below are separated by blanks, not tabs, so the default delimiter is used */
input
patient $
treatment $
start :date9.
end :date9.
;
format start end date9.;
datalines;
E1 A 10-Apr-17 26-Jun-17 Seq A
E1 B 07-Jun-18 08-Aug-18 Seq A
E2 B 06-Sep-16 20-Oct-16 Seq B
E2 A 15-Nov-17 04-Oct-18 Seq B
E3 A 07-Dec-10 08-Feb-11 Seq A
E3 A 06-Sep-16 20-Oct-16 Seq A
E3 B 15-Nov-17 04-Oct-18 Seq A
E4 B 07-Dec-10 08-Feb-11 Seq B
E4 B 06-Sep-16 20-Oct-16 Seq B
E4 A 15-Nov-17 04-Oct-18 Seq B
E5 A 27-Feb-18 20-Nov-18 Combi C
E5 B 22-May-18 30-Oct-18 Combi C
E7 A 01-Feb-16 28-Apr-16 Seq A
E7 A 20-Apr-17 16-May-17 Seq A
E7 B 21-Aug-17 02-Jan-19 Seq A
E7 A 27-May-19 29-Jul-19 Seq A
E8 B 01-Feb-16 28-Apr-16 Seq B
E8 B 20-Apr-17 16-May-17 Seq B
E8 A 21-Aug-17 02-Jan-19 Seq B
E8 B 27-May-19 29-Jul-19 Seq B
E9 A 01-Jan-21 01-Jan-21 Combi C
E9 B 01-Jan-21 01-Jan-21 Combi C
E10 A 13-Jul-15 21-Aug-15 Combi C
E10 A 27-Apr-21 27-Apr-21 Combi C
E10 B 27-Apr-21 27-Sep-21 Combi C
;
run;

Kurt_Bremser · Posted 04-14-2025 03:56 PM

See if this does it:

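data want;
if 0 then set have; /* never executes; only defines the variables of have at compile time */
flag1 = "Seq  "; /* default value, padded so that flag1 is long enough to hold "Combi" */
/* first pass over a patient: determine the flags */
do until (last.patient);
  set have;
  by patient notsorted; /* E9 precedes E10, so the groups are not in sort order */
  if first.patient then flag2 = treatment;
  if
    not first.patient
    and treatment ne lag(treatment)
    and (start lt lag(end) or start eq lag(start))
  then do;
    flag1 = "Combi";
    flag2 = "C";
  end;
end;
/* second pass over the same patient: write the rows with the final flags */
do until (last.patient);
  set have;
  by patient notsorted;
  output;
end;
run;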

If not, provide an example where it fails.

quickbluefish · Posted 04-14-2025 06:32 PM

This is the counting process macro I'm using:

https://github.com/Jeremy-Smith5/CEP-public/blob/main/SAS/cp.sas

...it's old, and a bit of a Rube Goldberg contraption, but works as long as you follow the instructions. The key thing is that the things you provide in the 'EVENT' variable must themselves be named in such a way that they could be valid (version 7) variable names. In other words, if your unique events are: DrugA, DrugB, DrugC, HospStay, Pneumonia - those are fine as names. But Drug A, Hospital Stay, etc. will not work with the current set up. The counting process data format, however you choose to go about creating it, is transformative for longitudinal work, esp. pharmepi, in my view.
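As a quick way to check that naming requirement before calling the macro, something like the following could be used. This is only a sketch: the dataset and variable names (event_names, name_check, ok_v7) are made up for illustration, and the sample values are the ones mentioned above. NVALID is a Base SAS function that returns 1 when a string is a valid variable name under the given rule.

```sas
/* Sample event values; in practice these would be the distinct values */
/* of the EVENT variable in the macro's input dataset.                 */
data event_names;
  input event $20.;
  datalines;
DrugA
Drug A
HospStay
Hospital Stay
;

data name_check;
  set event_names;
  /* 1 = usable as a V7 variable name, 0 = not usable */
  ok_v7 = nvalid(strip(event), 'v7');
run;

proc print data=name_check; run;
```

Any value with ok_v7 = 0 (here the two values containing a blank) would need to be renamed before it is passed as an event.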

anandrc · Posted 04-22-2025 01:13 PM

Thank you for the suggestion. I will try this for cases with 3+ treatments.
I really appreciate it.

anandrc · Posted 04-14-2025 02:27 AM

Appreciate the response. Very useful.

The rules are assigning the correct flag, but I do have a couple of scenarios to consider. Apologies, I did not foresee these exceptions.

For example, in the scenario below the current code flags the patient as A, but when the start dates match, we need a rule that takes precedence, finds the combination first, and flags it as C.

Patient	Treatment	Start	End
E9	A	01-Jan-21	01-Jan-21
E9	B	01-Jan-21	01-Jan-21

For the second scenario, it looks like I have to introduce a 30-day rule.

In the example below, I need to ignore the first line of treatment, because the gap between the end of the first treatment and the start of the second treatment is more than 30 days, and consider only the second A treatment for flagging purposes. The current code flags the patient as A, but if we ignore the first line of treatment, the start dates match, so it should be combination C.

Patient	Treatment	Start	End
E10	A	13-Jul-15	21-Aug-15
E10	A	27-Apr-21	27-Apr-21
E10	B	27-Apr-21	27-Sep-21

mkeintz · Posted 04-14-2025 08:20 PM

You can set up a HISTORY array (one element per date from the earliest possible to latest possible date). Pass through each patient twice. Initialize each patient to class='Seq ' and flag=treatment of the first record.

During the first pass, update the history array. If a date is encountered that has more than one treatment, set class to 'Combi' and flag to 'C', and stop monitoring dates: you won't be going back from Combi to Seq.

During the second pass, do nothing but permit the observations to be output, using the CLASS and FLAG values retained from the first pass:

data have;
infile datalines;
input
  patient $2.  treatment :$1.  start :date9.  end :date9.   
     _class :$5.   _flag :$1. ;
format start end date9.;
datalines;
E1 A 10Apr2017 26Jun2017 Seq A
E1 B 07Jun2018 08Aug2018 Seq A
E2 B 06Sep2016 20Oct2016 Seq B
E2 A 15Nov2017 04Oct2018 Seq B
E3 A 07Dec2010 08Feb2011 Seq A
E3 A 06Sep2016 20Oct2016 Seq A
E3 B 15Nov2017 04Oct2018 Seq A
E4 B 07Dec2010 08Feb2011 Seq B
E4 B 06Sep2016 20Oct2016 Seq B
E4 A 15Nov2017 04Oct2018 Seq B
E5 A 27Feb2018 20Nov2018 Combi C
E5 B 22May2018 30Oct2018 Combi C
E7 A 01Feb2016 28Apr2016 Seq A
E7 A 20Apr2017 16May2017 Seq A
E7 B 21Aug2017 02Jan2019 Seq A
E7 A 27May2019 29Jul2019 Seq A
E8 B 01Feb2016 28Apr2016 Seq B
E8 B 20Apr2017 16May2017 Seq B
E8 A 21Aug2017 02Jan2019 Seq B
E8 B 27May2019 29Jul2019 Seq B
run;

%let beg=01jan2010;
%let end=31dec2019;
data want (drop=d);
  set have (in=firstpass) have (in=secondpass);
  by patient;

  retain class '     ' Flag ' ' ;
  array history {%sysevalf("&beg"d):%sysevalf("&end"d)}  _temporary_;

  if first.patient then do;
    call missing(of history{*});
    class='Seq  ';
    flag=treatment;
  end;

  if firstpass=1 and class='Seq' then do d=start to end while (class='Seq');
    history{d}+1;
    if history{d}>1 then do;
      class='Combi';
      flag='C';
    end;
  end;
  if secondpass;
run;
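
Since this version of HAVE also reads the expected values into _class and _flag, one possible way to check the result is to list only the rows where the derived values differ from the expected ones. This is a sketch that assumes the WANT dataset created by the step above.

```sas
/* Any row printed here is a mismatch between expected and derived values. */
/* Character comparisons in SAS ignore trailing blanks, so 'Seq  ' = 'Seq'. */
proc print data=want;
  where class ne _class or flag ne _flag;
  var patient treatment start end _class class _flag flag;
run;
```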

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

anandrc · Posted 04-22-2025 01:15 PM

Thank you for the suggestion. I will save this and try this solution.
