Solved: Calculate sum of dosage for data containing overlapping periods of tim...

MonicaB · Posted 05-12-2022 04:47 AM

Hi everyone,

The study in this example was a double-blind study comparing 10, 20, and 30 mg of Drug X once daily vs Placebo. Study treatment was given as one tablet each from Bottles A, B, and C taken together once daily.
The subject in this example took:
• 1 tablet from Bottles A, B and C from 2011-01-14 to 2011-01-20
• 0 tablets from Bottle B on 2011-01-21, then 2 tablets on 2011-01-22
• 1 tablet from Bottles A and C on 2011-01-21 and 2011-01-22
• 1 tablet from Bottles A, B and C from 2011-01-23 to 2011-01-28
The EC dataset shows administrations as collected, in tablets.

Upon unmasking, it became known that the subject was randomized to Drug X 20 mg and that:
• Bottle A contained 10 mg/tablet.
• Bottle B contained 10 mg/tablet.
• Bottle C contained Placebo (i.e., 0 mg of active ingredient/tablet).
The EX dataset shows the doses administered in the protocol-specified unit (mg). The sponsor considered an administration to consist of the total amount for Bottles A, B, and C.

I was having trouble converting the core variable (ecdose) from EC dataset into the desired data structure in EX dataset. I've tried different approaches to obtain the sum of dosage for each non-overlapping time interval, but have failed to get the correct results. My code is shown below:

data have;

input id $ ectrt $8. dose    
    ecstdtc  yymmdd10. +1
    ecendtc  yymmdd10.
    ecstdy   ecendy;

format 
   ecstdtc  yymmdd10.
    ecendtc  yymmdd10.;

datalines;
abc4001 BOTTLE A 1 2011-01-14 2011-01-28 1 15
abc4001 BOTTLE C 1 2011-01-14 2011-01-28 1 15
abc4001 BOTTLE B 1 2011-01-14 2011-01-20 1 7
abc4001 BOTTLE B . 2011-01-21 2011-01-21 8 8
abc4001 BOTTLE B 2 2011-01-22 2011-01-22 9 9
abc4001 BOTTLE B 1 2011-01-23 2011-01-28 10 10
;
run;

data have1;
	set have;
	if ectrt='BOTTLE A' then exdoseu=10;
	else if ectrt='BOTTLE B' then exdoseu=10;
	else if ectrt='BOTTLE C' then exdoseu=0;
	exdose=exdoseu*dose;
	if exdose=. then exdose=0;

run;

proc sql;
	create table sum as
	select a.id, a.ecstdtc,a.ecendtc,a.ecstdy,a.ecendy,
		   sum(b.exdose) as sum

	from have1 as a,
         have1 as b
    where (a.id = b.id and b.ecendtc >= a.ecstdtc and b.ecendtc <= a.ecendtc)

    group by a.id, 
       a.ecstdtc, a.ecendtc
       ;
quit;

Any suggestions or ideas would be highly appreciated. Thank you so much in advance!!

mkeintz · Posted 05-12-2022 10:36 AM

First, thank you for providing sample data in a working DATA step, as well as your subsequent code.

I leave it to folks far more expert to find SQL code to do what you want - somehow it needs to track when there is a change in SUM and then determine the corresponding range of days and dates.

But I do think this DATA step code, which uses temporary arrays to hold accumulated doseage by day, will do what you want:

data have;
input id $ ectrt $8. dose    
    ecstdtc  yymmdd10. +1
    ecendtc  yymmdd10.
    ecstdy   ecendy;

format 
   ecstdtc  yymmdd10.
    ecendtc  yymmdd10.;

datalines;
abc4001 BOTTLE A 1 2011-01-14 2011-01-28 1 15
abc4001 BOTTLE C 1 2011-01-14 2011-01-28 1 15
abc4001 BOTTLE B 1 2011-01-14 2011-01-20 1 7
abc4001 BOTTLE B . 2011-01-21 2011-01-21 8 8
abc4001 BOTTLE B 2 2011-01-22 2011-01-22 9 9
abc4001 BOTTLE B 1 2011-01-23 2011-01-28 10 15      **replace 10 with 15**
;
run;

data have1;
	set have;
	if ectrt='BOTTLE A' then exdoseu=10;
	else if ectrt='BOTTLE B' then exdoseu=10;
	else if ectrt='BOTTLE C' then exdoseu=0;
	exdose=exdoseu*dose;
	if exdose=. then exdose=0;

run;



data want (keep=id ecstdtc ecendtc ecstdy ecendy sum);
  set have1 ;
  by id;
  array dosesum {30} _temporary_;       /*Total doses, by day */

  retain _maxecendy _ecstdtc1;

  if first.id then do;
    _maxecendy=1;
    _ecstdtc1=ecstdtc;
    do _d=1 to dim(dosesum);
      dosesum{_d}=0;
    end;
  end;

  _maxecendy=max(_maxecendy,ecendy);      /*Track highest ECENDY*/
  do _d=ecstdy to ecendy;
    dosesum{_d}=dosesum{_d}+exdose;
  end;

  if last.id;

  /*Now scan the daily totals, looking for change in dose*/
  ecendy=0;
  do until (ecendy=_maxecendy);
    ecstdy=ecendy+1;
    do ecendy=ecstdy to _maxecendy until(dosesum{ecendy+1}^=dosesum{ecstdy});
    end;

    ecstdtc=_ecstdtc1+ ecstdy-1;
    ecendtc=_ecstdtc1+ ecendy-1;
    sum=dosesum{ecstdy};
    output;

  end;
run;

The last record of your sample inline data originally specified day 10 through 10. This disagreed with the corresponding calendar dates, and with the EC table. So I changed the 10 to a 15 in my code above.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

mkeintz · Posted 05-12-2022 09:44 AM

I want to be sure I understand the intended structure in the EX dataset. It appears to me that you want a chronological series of nonoverlapping records for a given id, such that each record has a constant dose level, and each record has a different dose level from the immediately preceding and immediately succeeding record.

I ask because at first, based on your EC data structure (which had a day 10 through day 10 record), I thought you might have a record for day 10 through 10, and then 11 through 15. But both of them would have a dose of 10, so you collapsed them into days 10 through 15, correct?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

MonicaB · Posted 05-12-2022 10:25 PM

Hello, thank you so much for your response!

For "I want to be sure I understand the intended structure in the EX dataset. It appears to me that you want a chronological series of nonoverlapping records for a given id, such that each record has a constant dose level, and each record has a different dose level from the immediately preceding and immediately succeeding record." - Yes, it is correct. And for the day 10 through day 10 record, I apologize for this typo. It was supposed to be day 10 through day 15 as it shown in the picture.

mkeintz · Posted 05-12-2022 10:36 AM

First, thank you for providing sample data in a working DATA step, as well as your subsequent code.

I leave it to folks far more expert to find SQL code to do what you want - somehow it needs to track when there is a change in SUM and then determine the corresponding range of days and dates.

But I do think this DATA step code, which uses temporary arrays to hold accumulated doseage by day, will do what you want:

data have;
input id $ ectrt $8. dose    
    ecstdtc  yymmdd10. +1
    ecendtc  yymmdd10.
    ecstdy   ecendy;

format 
   ecstdtc  yymmdd10.
    ecendtc  yymmdd10.;

datalines;
abc4001 BOTTLE A 1 2011-01-14 2011-01-28 1 15
abc4001 BOTTLE C 1 2011-01-14 2011-01-28 1 15
abc4001 BOTTLE B 1 2011-01-14 2011-01-20 1 7
abc4001 BOTTLE B . 2011-01-21 2011-01-21 8 8
abc4001 BOTTLE B 2 2011-01-22 2011-01-22 9 9
abc4001 BOTTLE B 1 2011-01-23 2011-01-28 10 15      **replace 10 with 15**
;
run;

data have1;
	set have;
	if ectrt='BOTTLE A' then exdoseu=10;
	else if ectrt='BOTTLE B' then exdoseu=10;
	else if ectrt='BOTTLE C' then exdoseu=0;
	exdose=exdoseu*dose;
	if exdose=. then exdose=0;

run;



data want (keep=id ecstdtc ecendtc ecstdy ecendy sum);
  set have1 ;
  by id;
  array dosesum {30} _temporary_;       /*Total doses, by day */

  retain _maxecendy _ecstdtc1;

  if first.id then do;
    _maxecendy=1;
    _ecstdtc1=ecstdtc;
    do _d=1 to dim(dosesum);
      dosesum{_d}=0;
    end;
  end;

  _maxecendy=max(_maxecendy,ecendy);      /*Track highest ECENDY*/
  do _d=ecstdy to ecendy;
    dosesum{_d}=dosesum{_d}+exdose;
  end;

  if last.id;

  /*Now scan the daily totals, looking for change in dose*/
  ecendy=0;
  do until (ecendy=_maxecendy);
    ecstdy=ecendy+1;
    do ecendy=ecstdy to _maxecendy until(dosesum{ecendy+1}^=dosesum{ecstdy});
    end;

    ecstdtc=_ecstdtc1+ ecstdy-1;
    ecendtc=_ecstdtc1+ ecendy-1;
    sum=dosesum{ecstdy};
    output;

  end;
run;

The last record of your sample inline data originally specified day 10 through 10. This disagreed with the corresponding calendar dates, and with the EC table. So I changed the 10 to a 15 in my code above.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

MonicaB · Posted 05-13-2022 05:41 AM

Hello, thank you so much for your help! This solution is absolutely amazing. I've learned a lot from it. Thank you again!

Calculate sum of dosage for data containing overlapping periods of time

Re: Calculate sum of dosage for data containing overlapping periods of time

Re: Calculate sum of dosage for data containing overlapping periods of time

Re: Calculate sum of dosage for data containing overlapping periods of time

Re: Calculate sum of dosage for data containing overlapping periods of time

Re: Calculate sum of dosage for data containing overlapping periods of time

Registration is open

Calculate sum of dosage for data containing overlapping periods of time

Re: Calculate sum of dosage for data containing overlapping periods of time

Re: Calculate sum of dosage for data containing overlapping periods of time

Re: Calculate sum of dosage for data containing overlapping periods of time

Re: Calculate sum of dosage for data containing overlapping periods of time

Re: Calculate sum of dosage for data containing overlapping periods of time

Registration is open

SAS Training: Just a Click Away