Identify treatment discontinuation date

riya275 · Posted 03-11-2020 05:33 AM

Hi ,

I have data that looks like -

ID	Drug	Episode	start_date	end_date
1	A	1	1-Jan	30-Mar
1	A	2	1-May	3-Jul
1	A	3	28-Sep	28-Oct
1	A	4	1-Nov	30-Nov
1	B	1	1-Apr	10-May
1	B	2	2-Nov	28-Dec
1	C	1	1-Jul	2-Aug

data abc;
input ID $ drug $ episode start_date date9. end_date date9.;
format start_date end_date date9.;
informat start_date end_date date9.;
datalines ;
1 A 1 01Jan2012 30Mar2012
1 A 2 01May2012 03Jul2012
1 A 3 28Sep2012 28Oct2012
1 A 4 01Nov2012 30Dec2012
1 B 1 01Apr2012 10May2012
1 B 2 02Nov2012 28Dec2012
1 B 3 01Jan2012 30Mar2012
1 C 1 01Jul2012 02Aug2012
;
run;

Here we have subjects and the the drugs they take. A new episode of one drug means that the person discontinued.

If the start date ( start date of 1st episode) of second drug consumed , lies in between the episodes of first drug , then we will ignore all the further episodes of 1st drug.

Eg. here 1 april ( start date of drug B) lies after the first episode of drug A, so episode 2,3,4 of drug A would be deleted.

Similarly the start date for drug C lies after the end date of episode 1 for drug B then episode 2 of drug B would be deleted.

The maximum number of episodes a subject can have is 15.

The resultant dataset should look like -

ID	Drug	Episode	start_date	end_date
1	A	1	1-Jan	30-Mar
1	B	1	1-Apr	10-May
1	C	1	1-Jul	2-Aug

andreas_lds · Posted 03-11-2020 05:51 AM

Can you please post what you expect as result? And please post data in usable form, especially the date-variables are important, because a date is not a date if the year is missing.

Kurt_Bremser · Posted 03-11-2020 05:51 AM

Please post example data in a WORKING data step with datalines, and make sure you have usable data. Dates without years are useless.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Kurt_Bremser · Posted 03-12-2020 08:29 AM

Shouldn't we have two observations for drug B as the last observation of B ends (30mar2012) before the start of C?

Anyway, this is what I come up with:

data abc;
input ID $ drug $ episode start_date :date9. end_date :date9.;
format start_date end_date date9.;
informat start_date end_date date9.;
datalines ;
1 A 1 01Jan2012 30Mar2012
1 A 2 01May2012 03Jul2012
1 A 3 28Sep2012 28Oct2012
1 A 4 01Nov2012 30Dec2012
1 B 1 01Apr2012 10May2012
1 B 2 02Nov2012 28Dec2012
1 B 3 01Jan2012 30Mar2012
1 C 1 01Jul2012 02Aug2012
;

data firsts (keep=id count start_date rename=(start_date=first_date));
set abc;
retain count;
by id drug;
if first.id then count = 0;
if first.drug;
count + 1;
run;

data want;
set abc;
by id drug;
retain count;
if _n_ = 1
then do;
  declare hash f (dataset:"firsts");
  f.definekey("id","count");
  f.definedata("first_date");
  f.definedone();
  call missing(first_date);
end;
if first.id then count = 1;
if first.drug then count + 1;
rc = f.find();
put id= drug= count= rc= start_date first_date;
if rc = 0 and start_date > first_date then delete;
run;

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Identify treatment discontinuation date

Re: Identify treatment discontinuation date

Re: Identify treatment discontinuation date

Re: Identify treatment discontinuation date

Identify treatment discontinuation date

Re: Identify treatment discontinuation date

Re: Identify treatment discontinuation date

Re: Identify treatment discontinuation date

Registration is open

SAS Training: Just a Click Away