How Do I Combine Multiple Dose Groups to Compare Against Placebo


Posted 03-28-2017 03:59 PM
(2283 views)

Dear All,

In clinical trials with multiple dose groups and a placebo group, we sometimes want an estimate of the combined dose-group effect versus placebo at a specified endpoint (e.g., Week 8). Essentially, I am wondering whether it is better to pool the dose groups before fitting the model or to pool them in the CONTRAST/ESTIMATE statement itself. I have provided example code below. I cannot find documentation on how the two methods differ or when each is appropriate. I am working in SAS 9.4.

```
/* Simulate 20 subjects, each randomly assigned to one of 3 arms (1 = placebo) */
data test;
  call streaminit(33445);
  do id=1 to 20;
    rid=rand('normal');            /* subject-level random effect */
    trt=ceil(rand('uniform')*3);   /* random arm: 1, 2, or 3 */
    if trt in (2,3) then trt2=2;   /* trt2 pools the two active arms */
    else trt2=trt;
    do time=1 to 2;
      y=trt + trt*time + rand('normal') + rid;
      output;
    end;
  end;
run;

/* Model 1: three arms; the active arms are averaged in the coefficients */
proc mixed data=test;
  class id trt time;
  model y=trt time trt*time / e;
  repeated time / subject=id(trt) type=cs;
  contrast 'placebo vs active at timepoint 2' trt -1 .5 .5 trt*time 0 -1 0 .5 0 .5;
  estimate 'placebo vs active at timepoint 2' trt -1 .5 .5 trt*time 0 -1 0 .5 0 .5;
  lsmeans trt*time / diff;
run;

/* Model 2: active arms pooled into trt2 before fitting */
proc mixed data=test;
  class id trt2 time;
  model y=trt2 time trt2*time;
  repeated time / subject=id(trt2) type=cs;
  lsmeans trt2*time / diff;
  estimate 'placebo vs active at timepoint 2' trt2 -1 1 trt2*time 0 -1 0 1;
run;
```

Here are the results using trt in model:

| Label | Estimate | Standard Error | DF | t Value | Pr > \|t\| |
| --- | --- | --- | --- | --- | --- |
| placebo vs active at timepoint 2 | 5.7302 | 0.7368 | 17 | 7.78 | <.0001 |

Here are the results using trt2 in model:

| Label | Estimate | Standard Error | DF | t Value | Pr > \|t\| |
| --- | --- | --- | --- | --- | --- |
| placebo vs active at timepoint 2 | 5.9023 | 1.2494 | 18 | 4.72 | 0.0002 |

Many thanks in advance!!

6 REPLIES


A well-posed question 🙂

First, create a balanced data set so that you aren't trying to juggle the impacts of unbalanced data while you sort out syntax.

data newtest;
  call streaminit(33445);
  do id=1 to 10;
    rid=rand('normal'); *random effect for subject=id;
    do trt=1 to 3;
      if trt in (2,3) then trt2=2;
      else trt2=trt;
      do time=1 to 2;
        y=trt + trt*time + rand('normal') + rid;
        output;
      end;
    end;
  end;
run;

proc tabulate data=newtest;
  class trt trt2;
  table trt, trt2;
run;

Then run your two models. Note that the estimates of the difference now match, but SEs and DFs do not.
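
Why the point estimates agree on balanced data but not on the original data can be sketched as follows. This assumes complete data (every subject observed at both time points), so that each fitted cell mean equals the sample cell mean; $n_k$ and $\bar{y}_k$ (the size of arm $k$ and its sample mean at time 2) are notation introduced here, not part of the original post.

$$
\hat{\Delta}_{\text{contrast}} = \tfrac{1}{2}\left(\bar{y}_2 + \bar{y}_3\right) - \bar{y}_1,
\qquad
\hat{\Delta}_{\text{pooled}} = \frac{n_2\,\bar{y}_2 + n_3\,\bar{y}_3}{n_2 + n_3} - \bar{y}_1
$$

The two coincide when $n_2 = n_3$, as in the balanced data set. When 20 subjects are assigned to arms at random, the arm sizes are generally unequal, and the pooled model then weights each active arm by its size rather than equally.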

The fundamental difference in the two models lies in the REPEATED statement. The first model using

repeated time / subject=id(trt) type=cs;

identifies 30 subjects (10 IDs for each of 3 TRTs). But the REPEATED statement in the second model using

repeated time / subject=id(trt2) type=cs;

identifies only 20 subjects (10 IDs for each of 2 TRT2s). Consequently SEs and DFs differ.
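
A quick way to confirm these subject counts is to count the distinct ID-by-treatment combinations that each SUBJECT= effect defines. This is only a sketch, assuming the balanced `newtest` data set created above; the column aliases are made up for illustration.

```sas
proc sql;
  /* subject=id(trt): one subject per distinct (id, trt) pair */
  select count(*) as n_subjects_trt
    from (select distinct id, trt from newtest);

  /* subject=id(trt2): one subject per distinct (id, trt2) pair */
  select count(*) as n_subjects_trt2
    from (select distinct id, trt2 from newtest);
quit;
```

With IDs 1 to 10 reused in every arm, the first query should return 30 and the second 20, because IDs from arms 2 and 3 collapse into the same pooled subjects.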

If my experiment randomly assigned 3 treatments to 10 subjects per treatment so that I actually had 30 subjects in total, I would use the first model rather than the second because the first model preserves the experimental design; the second makes up a new one.


Hi All,

Thank you for the quick response.

I don't think I previously stated the hypothesis of interest: Is there a significant difference between groups 2 and 3 combined and group 1?

I realized that the IDs were identical across the treatment groups, so when I combined the two treatment groups the model assumed that certain subjects had multiple assessments at each time point (i.e., that there were only 10 subjects in the newly created treatment group and therefore only 20 subjects in total). I have updated my code to make the subject IDs unique. This experimental design assumes 10 subjects are randomized to each of 3 treatment groups (i.e., 30 subjects in total). If I am interested in comparing two pooled groups versus one group, how does the interpretation of the following two models differ? The estimated difference in least squares means (LSMD) is the same, but the SEs differ.

My gut is to use the ESTIMATE statement because that follows the experimental design, but I am wondering whether there is another reason beyond that, or whether I should use the pooled treatment group variable instead.

```
data newtest;
  call streaminit(33445);
  do id=1 to 10;
    rid=rand('normal'); *random effect for subject=id;
    do trt=1 to 3;
      if trt in (2,3) then trt2=2;
      else trt2=trt;
      do time=1 to 2;
        y=trt + trt*time + rand('normal') + rid;
        output;
      end;
    end;
  end;
run;

/* Give every subject a unique ID across arms: 1-10, 11-20, 21-30.        */
/* (id*trt + 11*trt repeats the values 36 and 42 in arms 2 and 3.)        */
data newtest;
  set newtest;
  id = id + 10*(trt - 1);
run;

/* Model 1: three arms; ESTIMATE compares arm 1 with the mean of arms 2 and 3 at time 2 */
proc mixed data=newtest method=reml;
  class id trt time;
  model y = trt time trt*time / s ddfm=kr covb;
  repeated time / type=un subject=id(trt);
  lsmeans trt*time / diff;
  estimate 'test1' trt 1 -0.5 -0.5
           trt*time 0 1
                    0 -0.5
                    0 -0.5 / e;
run;

/* Model 2: arms 2 and 3 pooled into trt2 before fitting */
proc mixed data=newtest method=reml;
  class id trt2 time;
  model y = trt2 time trt2*time / s ddfm=kr covb;
  repeated time / type=un subject=id(trt2);
  lsmeans trt2*time / diff e;
run;
```

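The results I get follow:

The first model (Estimates):

| Label | Estimate | Standard Error | DF | t Value | Pr > \|t\| |
| --- | --- | --- | --- | --- | --- |
| test1 | -4.7338 | 0.5992 | 27 | -7.90 | <.0001 |

The second model (Differences of Least Squares Means):

| Effect | TRT2 | TIME | _TRT2 | _TIME | Estimate | Standard Error | DF | t Value | Pr > \|t\| |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| TRT2*TIME | 1 | 1 | 1 | 2 | -1.0561 | 0.4529 | 28 | -2.33 | 0.0271 |
| TRT2*TIME | 1 | 1 | 2 | 1 | -3.7173 | 0.7552 | 28 | -4.92 | <.0001 |
| TRT2*TIME | 1 | 1 | 2 | 2 | -5.7899 | 0.7797 | 35 | -7.43 | <.0001 |
| TRT2*TIME | 1 | 2 | 2 | 1 | -2.6612 | 0.8034 | 34 | -3.31 | 0.0022 |
| TRT2*TIME | 1 | 2 | 2 | 2 | -4.7338 | 0.8265 | 28 | -5.73 | <.0001 |
| TRT2*TIME | 2 | 1 | 2 | 2 | -2.0725 | 0.3203 | 28 | -6.47 | <.0001 |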

Another part of the question: what if you want to perform pairwise comparisons as an exploratory analysis? Would you use ESTIMATE or CONTRAST statements to obtain those LSMDs, or would you fit the model using only the subjects in the treatment groups of interest? Again, one gets the same LSMD estimate, but the SE and DF differ.

```
proc mixed data=newtest method=reml;
  class id trt time;
  model y = trt time trt*time / s ddfm=kr;
  repeated time / type=un subject=id(trt);
  lsmeans trt*time / diff;
  estimate 'test2' trt 0 1 -1
           trt*time 0 0
                    0 1
                    0 -1 / e;
run;

proc mixed data=newtest method=reml;
  where trt in (2 3);
  class id trt time;
  model y = trt time trt*time / s ddfm=kr;
  repeated time / type=un subject=id(trt);
  lsmeans trt*time / diff;
run;
```

The output from the estimate statement (model 1), Estimates:

| Label | Estimate | Standard Error | DF | t Value | Pr > \|t\| |
| --- | --- | --- | --- | --- | --- |
| test2 | -3.5463 | 0.6919 | 27 | -5.13 | <.0001 |

The output from the subset model (model 2), Differences of Least Squares Means:

| Effect | TRT | TIME | _TRT | _TIME | Estimate | Standard Error | DF | t Value | Pr > \|t\| |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| TRT*TIME | 2 | 1 | 2 | 2 | -1.5477 | 0.3855 | 18 | -4.02 | 0.0008 |
| TRT*TIME | 2 | 1 | 3 | 1 | -2.4967 | 0.6730 | 18 | -3.71 | 0.0016 |
| TRT*TIME | 2 | 1 | 3 | 2 | -5.0940 | 0.7234 | 23.5 | -7.04 | <.0001 |
| TRT*TIME | 2 | 2 | 3 | 1 | -0.9489 | 0.7234 | 23.5 | -1.31 | 0.2023 |
| TRT*TIME | 2 | 2 | 3 | 2 | -3.5463 | 0.7705 | 18 | -4.60 | 0.0002 |
| TRT*TIME | 3 | 1 | 3 | 2 | -2.5973 | 0.3855 | 18 | -6.74 | <.0001 |

I greatly appreciate everyone's insight.


1. Use the ESTIMATE statement. The LSMESTIMATE statement is a great feature that makes writing contrasts even easier; check it out in the documentation or see

CONTRAST and ESTIMATE Statements Made Easy: The LSMESTIMATE Statement

2. Use ESTIMATE, CONTRAST, or LSMESTIMATE.

You could also take advantage of the SLICE option on the LSMEANS statement which estimates simple effects and saves you the effort of writing contrasts. The GLIMMIX procedure offers the SLICEDIFF option; check it out.
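
As a minimal sketch of the LSMESTIMATE and SLICE suggestions, assuming the `newtest` data set and the three-arm model from the question (the labels are made up for illustration). The coefficients are positional over the `trt*time` least squares means, in the order (1,1), (1,2), (2,1), (2,2), (3,1), (3,2):

```sas
proc mixed data=newtest method=reml;
  class id trt time;
  model y = trt time trt*time / ddfm=kr;
  repeated time / type=un subject=id(trt);

  /* arm 1 minus the mean of arms 2 and 3, at time 2 (same contrast as 'test1') */
  lsmestimate trt*time 'arm 1 vs mean of arms 2 and 3, time 2' 0 1 0 -0.5 0 -0.5;

  /* arm 2 minus arm 3, at time 2 (same contrast as 'test2') */
  lsmestimate trt*time 'arm 2 vs arm 3, time 2' 0 0 0 1 0 -1;

  /* F test of the treatment effect within each time point */
  lsmeans trt*time / slice=time;
run;
```

Because each LSMESTIMATE row is the same linear combination of cell means as the corresponding ESTIMATE statement, it should reproduce those estimates without any main-effect coefficients having to be written out.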


Thank you for the quick response.

Beyond the ways the means can be estimated, I am wondering what the interpretative difference is between a model with a three-level treatment variable plus a contrast that 'averages the cell means' and a model with a two-level treatment variable. I understand that the point estimates are the same, but the SEs and DFs differ, so I am trying to understand the difference between the two methods. Which model is better suited to answer my question, "Is there a difference between groups 2 and 3 combined and group 1?"


Your experimental design involved subjects assigned to three treatment groups, not subjects assigned to two treatment groups. The experimental design determines the statistical model. Post-hoc redefinition of experimental treatments is hardly ever (even never?) a good idea.

In my opinion, the appropriate model specifies three treatment groups with a contrast to compare the mean of groups 2 and 3 to the mean of group 1.
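
The size of the SE gap in the posted output can be accounted for with a back-of-the-envelope check. This is a sketch, assuming complete, balanced data (10 subjects per arm) and that each model's time-2 variance is estimated by its pooled within-group sample variance; $s^2$ (three-arm model), $\tilde{s}^2$ (pooled model), and $d$ (the arm 2 vs arm 3 difference at time 2) are notation introduced here. Both models multiply their variance estimate by the same factor:

$$
\mathrm{SE}^2_{\text{3-arm}} = s^2\left(\tfrac{1}{10} + \tfrac{1}{4\cdot 10} + \tfrac{1}{4\cdot 10}\right) = 0.15\,s^2,
\qquad
\mathrm{SE}^2_{\text{pooled}} = \tilde{s}^2\left(\tfrac{1}{10} + \tfrac{1}{20}\right) = 0.15\,\tilde{s}^2
$$

so only the variance estimate differs. Pooling moves the between-dose sum of squares into the error term:

$$
\tilde{s}^2 = \frac{27\,s^2 + \frac{10\cdot 10}{10 + 10}\,d^2}{28}
$$

Plugging in the posted values, $\mathrm{SE}_{\text{3-arm}} = 0.5992$ gives $s^2 \approx 2.394$, and $d = -3.5463$ gives $\tilde{s}^2 \approx (64.63 + 62.88)/28 \approx 4.554$, hence $\mathrm{SE}_{\text{pooled}} \approx \sqrt{0.15 \times 4.554} \approx 0.8265$, which agrees with the pooled model's output. In other words, the two-level model treats the real difference between the two doses as subject-to-subject noise.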


Thank you for your response. This was my thinking as well.
