BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
saipriya92
Calcite | Level 5

Hi all, 

 

I have this dataset ;

sasuser.sample1_drugs

 Obs PTID DRUG_DT DRUG123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133

PT07837533331DEC2014ETOPOSIDE
PT07837533301JAN2015CISPLATIN
PT07837533301JAN2015ETOPOSIDE
PT07837533302JAN2015ETOPOSIDE
PT07837533321JAN2015CISPLATIN
PT07837533321JAN2015ETOPOSIDE
PT07837533322JAN2015ETOPOSIDE
PT07837533323JAN2015ETOPOSIDE
PT07837533311FEB2015CISPLATIN
PT07837533311FEB2015ETOPOSIDE
PT07837533312FEB2015ETOPOSIDE
PT07837533313FEB2015ETOPOSIDE
PT07837533304MAR2015CISPLATIN
PT07837533305MAR2015ETOPOSIDE
PT07837533306MAR2015ETOPOSIDE
PT07883721413NOV2013CARBOPLATIN
PT07883721413NOV2013ETOPOSIDE
PT07883721411DEC2013CARBOPLATIN
PT07883721411DEC2013ETOPOSIDE
PT07883721412DEC2013ETOPOSIDE
PT08238804418MAR2013CARBOPLATIN
PT08238804418MAR2013ETOPOSIDE
PT08238804419MAR2013ETOPOSIDE
PT08238804420MAR2013ETOPOSIDE
PT08238804408APR2013CARBOPLATIN
PT08238804408APR2013ETOPOSIDE
PT08238804409APR2013ETOPOSIDE
PT08238804410APR2013ETOPOSIDE
PT08238804401MAY2013CARBOPLATIN
PT08238804401MAY2013ETOPOSIDE
PT08238804402MAY2013ETOPOSIDE
PT08238804403MAY2013ETOPOSIDE
PT08238804422MAY2013CARBOPLATIN
PT08238804422MAY2013ETOPOSIDE
PT08238804423MAY2013ETOPOSIDE
PT08238804424MAY2013ETOPOSIDE
PT08238804419JAN2015CARBOPLATIN
PT08238804419JAN2015ETOPOSIDE
PT08238804420JAN2015ETOPOSIDE
PT08238804421JAN2015ETOPOSIDE
PT08238804409FEB2015CARBOPLATIN
PT08238804409FEB2015ETOPOSIDE
PT08238804410FEB2015ETOPOSIDE
PT08238804411FEB2015ETOPOSIDE
PT08238804403MAR2015CARBOPLATIN
PT08238804403MAR2015ETOPOSIDE
PT08238804404MAR2015ETOPOSIDE
PT08238804405MAR2015ETOPOSIDE
PT08238804424MAR2015CARBOPLATIN
PT08238804424MAR2015ETOPOSIDE
PT08238804425MAR2015ETOPOSIDE
PT08238804426MAR2015ETOPOSIDE
PT08238804402FEB2016CARBOPLATIN
PT08238804402FEB2016ETOPOSIDE
PT08238804403FEB2016ETOPOSIDE
PT08238804404FEB2016ETOPOSIDE
PT08238804423FEB2016CARBOPLATIN
PT08238804423FEB2016ETOPOSIDE
PT08238804424FEB2016ETOPOSIDE
PT08238804425FEB2016ETOPOSIDE
PT08238804415MAR2016CARBOPLATIN
PT08238804415MAR2016ETOPOSIDE
PT08238804416MAR2016ETOPOSIDE
PT08238804417MAR2016ETOPOSIDE
PT08358537717APR2012CARBOPLATIN
PT08358537717APR2012ETOPOSIDE
PT08358537718APR2012ETOPOSIDE
PT08358537719APR2012ETOPOSIDE
PT08358537709MAY2012CARBOPLATIN
PT08358537709MAY2012ETOPOSIDE
PT08358537710MAY2012ETOPOSIDE
PT08358537711MAY2012ETOPOSIDE
PT08358537730MAY2012CARBOPLATIN
PT08358537730MAY2012ETOPOSIDE
PT08358537731MAY2012ETOPOSIDE
PT08358537701JUN2012ETOPOSIDE
PT08358537720JUN2012CARBOPLATIN
PT08358537720JUN2012ETOPOSIDE
PT08358537721JUN2012ETOPOSIDE
PT08358537722JUN2012ETOPOSIDE
PT08358537725JUL2012CARBOPLATIN
PT08358537725JUL2012ETOPOSIDE
PT08358537726JUL2012ETOPOSIDE
PT08358537727JUL2012ETOPOSIDE
PT08358537715AUG2012CARBOPLATIN
PT08358537715AUG2012ETOPOSIDE
PT08358537716AUG2012ETOPOSIDE
PT08358537717AUG2012ETOPOSIDE
PT08358537703JAN2013CARBOPLATIN
PT08358537703JAN2013ETOPOSIDE
PT08358537722JAN2013CARBOPLATIN
PT08358537722JAN2013IRINOTECAN
PT08358537718FEB2013CARBOPLATIN
PT08358537718FEB2013IRINOTECAN
PT08358537725FEB2013CARBOPLATIN
PT08358537725FEB2013IRINOTECAN
PT08358537704MAR2013CARBOPLATIN
PT08358537704MAR2013IRINOTECAN
PT08358537716APR2013CARBOPLATIN
PT08358537716APR2013IRINOTECAN
PT08358537723APR2013CARBOPLATIN
PT08358537723APR2013IRINOTECAN
PT08358537730APR2013CARBOPLATIN
PT08358537730APR2013IRINOTECAN
PT08358537731MAY2013CARBOPLATIN
PT08358537731MAY2013IRINOTECAN
PT08358537707JUN2013CARBOPLATIN
PT08358537707JUN2013IRINOTECAN
PT08358537724JUN2013TOPOTECAN
PT08358537725JUN2013TOPOTECAN
PT08358537726JUN2013TOPOTECAN
PT08358537727JUN2013TOPOTECAN
PT08358537728JUN2013TOPOTECAN
PT08358537722JUL2013CARBOPLATIN
PT08358537722JUL2013IRINOTECAN
PT08358537722JUL2013TOPOTECAN
PT08358537729JUL2013TOPOTECAN
PT08358537730JUL2013TOPOTECAN
PT08358537731JUL2013TOPOTECAN
PT08358537701AUG2013TOPOTECAN
PT08358537702AUG2013TOPOTECAN
PT08358537726AUG2013TOPOTECAN
PT08358537727AUG2013TOPOTECAN
PT08358537728AUG2013TOPOTECAN
PT08358537729AUG2013TOPOTECAN
PT08358537716SEP2013DOCETAXEL
PT08358537708OCT2013DOCETAXEL
PT08358537711NOV2013GEMCITABINE
PT08358537718NOV2013GEMCITABINE
PT08358537709DEC2013GEMCITABINE
PT08358537723DEC2013GEMCITABINE
PT08358537730DEC2013GEMCITABINE
PT08358537706JAN2014GEMCITABINE

 

 

I have made this intermediate step ;

sasuser.sample1_step1

 Obs PTID DRUG STDT ENDDT12345678910111213141516

PT078375333Etoposide31DEC201406MAR2015
PT078375333Cisplatin01JAN201504MAR2015
PT078837214Carboplatin13NOV201311DEC2013
PT078837214Etoposide13NOV201312DEC2013
PT082388044Carboplatin18MAR201322MAY2013
PT082388044Etoposide18MAR201324MAY2013
PT082388044Carboplatin19JAN201524MAR2015
PT082388044Etoposide19JAN201526MAR2015
PT083585377Carboplatin17APR201215AUG2012
PT083585377Etoposide17APR201217AUG2012
PT083585377Etoposide03JAN201303JAN2013
PT083585377Carboplatin03JAN201322JUL2013
PT083585377Irinotecan22JAN201322JUL2013
PT083585377Topotecan24JUN201329AUG2013
PT083585377Docetaxel16SEP201308OCT2013
PT083585377Gemcitabine11NOV201306JAN2014

 

The code for the intermediate step is:

proc sort data=sasuser.sample1_drugs out=sample1_sorted;
	by ptid drug ;
run ;

data step1_1 ;
	set sample1_sorted ;
	by ptid drug ;
	if first.drug then gap=0;
	format lag_date date9. ;
	lag_date= lag(drug_dt);
	if drug_dt-lag_date >=60 /*and first.ptid=0*/ then gap + 1;
run ;

proc sql;
	create table step1_final as 
	select ptid, drug, min(drug_dt) as stdt format=date9. ,max(drug_dt) as enddt format=date9. 	
	from step1_1 
	group by ptid, drug, gap ;
quit ;

 

I want to create this final dataset ;

sasuser.sample1_step2

 Obs PTID REGIMEN SDATE EDATE12345678910111213141516171819202122

PT078375333Etoposide31DEC201431DEC2014
PT078375333Cisplatin+Etoposide01JAN201504MAR2015
PT078375333Etoposide05MAR201506MAR2015
PT078837214Carboplatin+Etoposide13NOV201311DEC2013
PT078837214Etoposide12DEC201312DEC2013
PT082388044Carboplatin+Etoposide18MAR201322MAY2013
PT082388044Etoposide23MAY201324MAY2013
PT082388044Blank25MAY201318JAN2015
PT082388044Carboplatin+Etoposide19JAN201524MAR2015
PT082388044Etoposide25MAR201526MAR2015
PT083585377Carboplatin+Etoposide17APR201215AUG2012
PT083585377Etoposide16AUG201217AUG2012
PT083585377Blank18AUG201202JAN2013
PT083585377Carboplatin+Etoposide03JAN201303JAN2013
PT083585377Carboplatin04JAN201321JAN2013
PT083585377Carboplatin+Irinotecan22JAN201323JUN2013
PT083585377Carboplatin+Irinotecan+Topotecan24JUN201322JUL2013
PT083585377Topotecan23JUL201329AUG2013
PT083585377Blank30AUG201315SEP2013
PT083585377Docetaxel16SEP201308OCT2013
PT083585377Blank09OCT201310NOV2013
PT083585377Gemcitabine11NOV201306JAN2014

 

How to code for this ?

1 ACCEPTED SOLUTION

Accepted Solutions
s_lassen
Meteorite | Level 14

Here is a "brute force" method, which simply generates one obs per date per regimen, and then sums that up:

data all_days;
  set step1_final;
  do date=stdt to enddt;
    output;
    end;
  drop stdt enddt;
run;

proc sort;
  by ptid date;
run;

data all_days2;
  do until(last.date);
    set all_days;
    by ptid date;
    length regimen $200;
    call catx('+',regimen,drug);
    end;
run;

data all_days3;
  set all_days2;
  by ptid date; 
  diff=dif(date);
  output;
  if not first.ptid and diff>1 then do;
    regimen='BLANK';
    do date=date-diff+1 to date-1;
      output;
      end;
    end;
run;

proc sort;
  by ptid date;
run;

data want;
  do until(last.regimen);
    set all_days3;
    by ptid regimen notsorted;
    if first.regimen then
      stdt=date;
    end;
  enddt=date;
  format stdt enddt date9.;
  keep ptid regimen stdt enddt;
run;

I realized that my first suggestion might create to many "BLANK" records if one patient was followed by another whose regimen started after the first patient's regimen ended. So I edited my response a bit to account for that (the datastep ALL_DAYS2 is split into two steps).

View solution in original post

4 REPLIES 4
PaigeMiller
Diamond | Level 26

Please explain the logic that enables you to get the final data set from the original data set.

--
Paige Miller
saipriya92
Calcite | Level 5

Step 1: Find the treatment periods of each individual drug for each patient. Use the drug encounter data to create periods of continuous treatment for each individual chemotherapy.  A period of continuous chemo administration starts on the first date of use and ends when there is a gap of greater than 60 days between 2 consecutive chemotherapy (of the same drug) administrations.

 

Step 2: Line up treatments and create the patient's treatment journey from start of first treatment to the end of the last treatment. Identify treatments that are being given concurrently. If there are no treatments given during a certain period, that period will be labeled as a blank regimen.

 

s_lassen
Meteorite | Level 14

Here is a "brute force" method, which simply generates one obs per date per regimen, and then sums that up:

data all_days;
  set step1_final;
  do date=stdt to enddt;
    output;
    end;
  drop stdt enddt;
run;

proc sort;
  by ptid date;
run;

data all_days2;
  do until(last.date);
    set all_days;
    by ptid date;
    length regimen $200;
    call catx('+',regimen,drug);
    end;
run;

data all_days3;
  set all_days2;
  by ptid date; 
  diff=dif(date);
  output;
  if not first.ptid and diff>1 then do;
    regimen='BLANK';
    do date=date-diff+1 to date-1;
      output;
      end;
    end;
run;

proc sort;
  by ptid date;
run;

data want;
  do until(last.regimen);
    set all_days3;
    by ptid regimen notsorted;
    if first.regimen then
      stdt=date;
    end;
  enddt=date;
  format stdt enddt date9.;
  keep ptid regimen stdt enddt;
run;

I realized that my first suggestion might create to many "BLANK" records if one patient was followed by another whose regimen started after the first patient's regimen ended. So I edited my response a bit to account for that (the datastep ALL_DAYS2 is split into two steps).

ed_sas_member
Meteorite | Level 14

Hi @saipriya92 ,

 

You can try this. Hope this help!

 

/* Expand dataset: create one row per day for each drug */

data sample1_step1_exp;
	set sample1_step1;
	format DAY date9.;
	do DAY=STDT to ENDDT;
		output;
	end;
run;

proc sort data=sample1_step1_exp;
	by PTID DAY;
run;

/* Transpose drugs to create combination regimens */

proc transpose data=sample1_step1_exp out=sample1_step1_tr (drop=_name_);
	var DRUG;
	by PTID DAY;
run;

data sample1_step1_regimen1;
	set sample1_step1_tr;
	
	REGIMEN = catx(" + ",of col:);
	
	keep PTID DAY REGIMEN;
run;

/* Creation of the variable COUNTER, which is incremented each time a new regimen is given */
data sample1_step1_regimen2;
	set sample1_step1_regimen1;
	
	by PTID REGIMEN notsorted;
	if first.PTID then COUNTER=0;

	if first.REGIMEN then COUNTER + 1;
run;

/* Creation of the final table without "blank" periods of time */
proc sql;
	create table sample1_step1_regimen3 as
	select PTID,
	min(DAY) as STDT format=date9.,
	max(DAY) as ENDDT format=date9.,
	REGIMEN
	from sample1_step1_regimen2 group by PTID, REGIMEN, COUNTER order by PTID, STDT ;
quit;

/* Identification of "blank" periods of time */
data sample1_step1_blank (keep= PTID REGIMEN _STDT_blank _ENDDT_blank );
	set sample1_step1_regimen3;
	format _STDT_blank _ENDDT_blank date9.;
	by PTID;
	_STDT_blank = lag(ENDDT) +1 ;
	if first.PTID or _STDT_blank = STDT then _STDT_blank=.;
	if _STDT_blank ne . then do;
			_ENDDT_blank = STDT - 1;
			REGIMEN = "Blank";
			output;
		end;
run;

/* Creation of the final table (retrieve "blank" periods of time) */
data sample1_step2;
	set sample1_step1_regimen3 sample1_step1_blank (rename=(_STDT_blank= STDT _ENDDT_blank=ENDDT));
run;

proc sort data=sample1_step2;
	by PTID STDT;
run;

Output:

Capture d’écran 2019-12-03 à 20.20.32.png

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1345 views
  • 1 like
  • 4 in conversation