Hi everyone,
The study in this example was a double-blind study comparing 10, 20, and 30 mg of Drug X once daily vs Placebo. Study treatment was given as one tablet each from Bottles A, B, and C taken together once daily.
The subject in this example took:
• 1 tablet from Bottles A, B and C from 2011-01-14 to 2011-01-20
• 0 tablets from Bottle B on 2011-01-21, then 2 tablets on 2011-01-22
• 1 tablet from Bottles A and C on 2011-01-21 and 2011-01-22
• 1 tablet from Bottles A, B and C from 2011-01-23 to 2011-01-28
The EC dataset shows administrations as collected, in tablets.
Upon unmasking, it became known that the subject was randomized to Drug X 20 mg and that:
• Bottle A contained 10 mg/tablet.
• Bottle B contained 10 mg/tablet.
• Bottle C contained Placebo (i.e., 0 mg of active ingredient/tablet).
The EX dataset shows the doses administered in the protocol-specified unit (mg). The sponsor considered an administration to consist of the total amount for Bottles A, B, and C.
I was having trouble converting the core variable (ecdose) from EC dataset into the desired data structure in EX dataset. I've tried different approaches to obtain the sum of dosage for each non-overlapping time interval, but have failed to get the correct results. My code is shown below:
data have;
input id $ ectrt $8. dose
ecstdtc yymmdd10. +1
ecendtc yymmdd10.
ecstdy ecendy;
format
ecstdtc yymmdd10.
ecendtc yymmdd10.;
datalines;
abc4001 BOTTLE A 1 2011-01-14 2011-01-28 1 15
abc4001 BOTTLE C 1 2011-01-14 2011-01-28 1 15
abc4001 BOTTLE B 1 2011-01-14 2011-01-20 1 7
abc4001 BOTTLE B . 2011-01-21 2011-01-21 8 8
abc4001 BOTTLE B 2 2011-01-22 2011-01-22 9 9
abc4001 BOTTLE B 1 2011-01-23 2011-01-28 10 10
;
run;
data have1;
set have;
if ectrt='BOTTLE A' then exdoseu=10;
else if ectrt='BOTTLE B' then exdoseu=10;
else if ectrt='BOTTLE C' then exdoseu=0;
exdose=exdoseu*dose;
if exdose=. then exdose=0;
run;
proc sql;
create table sum as
select a.id, a.ecstdtc,a.ecendtc,a.ecstdy,a.ecendy,
sum(b.exdose) as sum
from have1 as a,
have1 as b
where (a.id = b.id and b.ecendtc >= a.ecstdtc and b.ecendtc <= a.ecendtc)
group by a.id,
a.ecstdtc, a.ecendtc
;
quit;
Any suggestions or ideas would be highly appreciated. Thank you so much in advance!!
First, thank you for providing sample data in a working DATA step, as well as your subsequent code.
I leave it to folks far more expert to find SQL code to do what you want - somehow it needs to track when there is a change in SUM and then determine the corresponding range of days and dates.
But I do think this DATA step code, which uses temporary arrays to hold accumulated doseage by day, will do what you want:
data have;
input id $ ectrt $8. dose
ecstdtc yymmdd10. +1
ecendtc yymmdd10.
ecstdy ecendy;
format
ecstdtc yymmdd10.
ecendtc yymmdd10.;
datalines;
abc4001 BOTTLE A 1 2011-01-14 2011-01-28 1 15
abc4001 BOTTLE C 1 2011-01-14 2011-01-28 1 15
abc4001 BOTTLE B 1 2011-01-14 2011-01-20 1 7
abc4001 BOTTLE B . 2011-01-21 2011-01-21 8 8
abc4001 BOTTLE B 2 2011-01-22 2011-01-22 9 9
abc4001 BOTTLE B 1 2011-01-23 2011-01-28 10 15 **replace 10 with 15**
;
run;
data have1;
set have;
if ectrt='BOTTLE A' then exdoseu=10;
else if ectrt='BOTTLE B' then exdoseu=10;
else if ectrt='BOTTLE C' then exdoseu=0;
exdose=exdoseu*dose;
if exdose=. then exdose=0;
run;
data want (keep=id ecstdtc ecendtc ecstdy ecendy sum);
set have1 ;
by id;
array dosesum {30} _temporary_; /*Total doses, by day */
retain _maxecendy _ecstdtc1;
if first.id then do;
_maxecendy=1;
_ecstdtc1=ecstdtc;
do _d=1 to dim(dosesum);
dosesum{_d}=0;
end;
end;
_maxecendy=max(_maxecendy,ecendy); /*Track highest ECENDY*/
do _d=ecstdy to ecendy;
dosesum{_d}=dosesum{_d}+exdose;
end;
if last.id;
/*Now scan the daily totals, looking for change in dose*/
ecendy=0;
do until (ecendy=_maxecendy);
ecstdy=ecendy+1;
do ecendy=ecstdy to _maxecendy until(dosesum{ecendy+1}^=dosesum{ecstdy});
end;
ecstdtc=_ecstdtc1+ ecstdy-1;
ecendtc=_ecstdtc1+ ecendy-1;
sum=dosesum{ecstdy};
output;
end;
run;
The last record of your sample inline data originally specified day 10 through 10. This disagreed with the corresponding calendar dates, and with the EC table. So I changed the 10 to a 15 in my code above.
I want to be sure I understand the intended structure in the EX dataset. It appears to me that you want a chronological series of nonoverlapping records for a given id, such that each record has a constant dose level, and each record has a different dose level from the immediately preceding and immediately succeeding record.
I ask because at first, based on your EC data structure (which had a day 10 through day 10 record), I thought you might have a record for day 10 through 10, and then 11 through 15. But both of them would have a dose of 10, so you collapsed them into days 10 through 15, correct?
Hello, thank you so much for your response!
For "I want to be sure I understand the intended structure in the EX dataset. It appears to me that you want a chronological series of nonoverlapping records for a given id, such that each record has a constant dose level, and each record has a different dose level from the immediately preceding and immediately succeeding record." - Yes, it is correct. And for the day 10 through day 10 record, I apologize for this typo. It was supposed to be day 10 through day 15 as it shown in the picture.
First, thank you for providing sample data in a working DATA step, as well as your subsequent code.
I leave it to folks far more expert to find SQL code to do what you want - somehow it needs to track when there is a change in SUM and then determine the corresponding range of days and dates.
But I do think this DATA step code, which uses temporary arrays to hold accumulated doseage by day, will do what you want:
data have;
input id $ ectrt $8. dose
ecstdtc yymmdd10. +1
ecendtc yymmdd10.
ecstdy ecendy;
format
ecstdtc yymmdd10.
ecendtc yymmdd10.;
datalines;
abc4001 BOTTLE A 1 2011-01-14 2011-01-28 1 15
abc4001 BOTTLE C 1 2011-01-14 2011-01-28 1 15
abc4001 BOTTLE B 1 2011-01-14 2011-01-20 1 7
abc4001 BOTTLE B . 2011-01-21 2011-01-21 8 8
abc4001 BOTTLE B 2 2011-01-22 2011-01-22 9 9
abc4001 BOTTLE B 1 2011-01-23 2011-01-28 10 15 **replace 10 with 15**
;
run;
data have1;
set have;
if ectrt='BOTTLE A' then exdoseu=10;
else if ectrt='BOTTLE B' then exdoseu=10;
else if ectrt='BOTTLE C' then exdoseu=0;
exdose=exdoseu*dose;
if exdose=. then exdose=0;
run;
data want (keep=id ecstdtc ecendtc ecstdy ecendy sum);
set have1 ;
by id;
array dosesum {30} _temporary_; /*Total doses, by day */
retain _maxecendy _ecstdtc1;
if first.id then do;
_maxecendy=1;
_ecstdtc1=ecstdtc;
do _d=1 to dim(dosesum);
dosesum{_d}=0;
end;
end;
_maxecendy=max(_maxecendy,ecendy); /*Track highest ECENDY*/
do _d=ecstdy to ecendy;
dosesum{_d}=dosesum{_d}+exdose;
end;
if last.id;
/*Now scan the daily totals, looking for change in dose*/
ecendy=0;
do until (ecendy=_maxecendy);
ecstdy=ecendy+1;
do ecendy=ecstdy to _maxecendy until(dosesum{ecendy+1}^=dosesum{ecstdy});
end;
ecstdtc=_ecstdtc1+ ecstdy-1;
ecendtc=_ecstdtc1+ ecendy-1;
sum=dosesum{ecstdy};
output;
end;
run;
The last record of your sample inline data originally specified day 10 through 10. This disagreed with the corresponding calendar dates, and with the EC table. So I changed the 10 to a 15 in my code above.
Hello, thank you so much for your help! This solution is absolutely amazing. I've learned a lot from it. Thank you again!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.