Hi , I have data that looks like - ID Drug Episode start_date end_date 1 A 1 1-Jan 30-Mar 1 A 2 1-May 3-Jul 1 A 3 28-Sep 28-Oct 1 A 4 1-Nov 30-Nov 1 B 1 1-Apr 10-May 1 B 2 2-Nov 28-Dec 1 C 1 1-Jul 2-Aug data abc;
input ID $ drug $ episode start_date date9. end_date date9.;
format start_date end_date date9.;
informat start_date end_date date9.;
datalines ;
1 A 1 01Jan2012 30Mar2012
1 A 2 01May2012 03Jul2012
1 A 3 28Sep2012 28Oct2012
1 A 4 01Nov2012 30Dec2012
1 B 1 01Apr2012 10May2012
1 B 2 02Nov2012 28Dec2012
1 B 3 01Jan2012 30Mar2012
1 C 1 01Jul2012 02Aug2012
;
run; Here we have subjects and the the drugs they take. A new episode of one drug means that the person discontinued. If the start date ( start date of 1st episode) of second drug consumed , lies in between the episodes of first drug , then we will ignore all the further episodes of 1st drug. Eg. here 1 april ( start date of drug B) lies after the first episode of drug A, so episode 2,3,4 of drug A would be deleted. Similarly the start date for drug C lies after the end date of episode 1 for drug B then episode 2 of drug B would be deleted. The maximum number of episodes a subject can have is 15. The resultant dataset should look like - ID Drug Episode start_date end_date 1 A 1 1-Jan 30-Mar 1 B 1 1-Apr 10-May 1 C 1 1-Jul 2-Aug
... View more