Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Home
- /
- Analytics
- /
- Stat Procs
- /
- Mixed modelling, help with correct procedure and code

Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

🔒 This topic is **solved** and **locked**.
Need further help from the community? Please
sign in and ask a **new** question.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Posted 07-20-2018 08:39 PM
(1855 views)

I would appreciate help with how to code the following experiment. I'm using SAS 9.3.

The experiment used blood from six donors.

The blood from each donor was divided into four culture wells, and each well incubated with one of a combination of two drugs (A and B), each drug either absent (i.e., diluent only; coded 0) or present at a predetermined concentration (coded 1). Only blood from one donor can be run at a given time.

I've approached this treatment phase as the full factorial of two drugs (A|B) but could see doing it as one treatment factor (trt) with four levels (trt0=A0B0, trt1=A1B0, trt2=A0B1, trt3=A1B1). Is there an advantage of one versus the other?

The activation state (%activated) of two types of T-lymphocyte (T-type = CD4 or CD8) is determined from each well.

The main interest is the effect of the treatment conditions on %activation within each T-cell type (all comparisons). It is of secondary interest to know if the treatment effects differ between the two types of T-cell.

Is MIXED or GLIMMIX more appropriate?

Does subject=donor or T-type(donor) or should it be subject=donor and group=T-type?

Should T-type be in the model statement?

Donor is random but what about T-type? Should T-type be in the model statement?

Do I need to have a repeated statement?

Here is something that works but I fear is far too simplistic:

```
proc glimmix data=work.CD4CD8 ;
class donor T_type A B ;
model activation = T_type|A|B ;
random _residual_ / subject=donor group=T_type ;
run;
```

Any advice or direction greatly appreciated.

Dave

1 ACCEPTED SOLUTION

Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

The design is definitely a mixed model. The MIXED procedure assumes that the response is normally distributed (conditional on the predictors); GLIMMIX allows other distributional assumptions, among them normal, beta and binomial.

Generally speaking, but there are always exceptions, a percent (or proportion) response uses either a beta or a binomial distribution, so GLIMMIX is typically more appropriate than MIXED. A proportion that is obtained as a ratio of counts (e.g., number of "successes" out of number of "trials") calls for a binomial distribution; a proportion measured directly calls for a beta distribution, as noted by @plf515.

If I have a two-way (here, 2x2) factorial, I usually specify the model as A x B rather than a single factor with 4 levels. But the statistical model is the same either way; it's just a different parameterization, and one form may deliver what you want more directly than the other. You get to choose.

I would consider the following model AS A STARTING POINT (for a beta distribution; syntax would differ for binomial):

```
proc glimmix data= work.cd4cd8 ;
class donor a b t_type;
model activation = a | b | t_type / dist=beta;
random intercept a*b / subject=donor;
lsmeans a | b | t_type / ilink;
run;
```

Keep in mind that there are a lot of options that might be better than the default options implied in the code above. Default options are not always the best choice for generalized linear mixed models.

You may want to start with a normal distribution assumption within GLIMMIX before attempting more challenging models, even though a normal distribution is probably not a valid choice. It's good to get your feet under you with *general* linear mixed models before you dive into *generalized* linear mixed models.

I hope this helps.

9 REPLIES 9

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Thank you. I wondered about that. Yes, in this case this is bounded, values can only be 0 to 100. I'll see how this affects the modeling but it seems correct regardless.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

The design is definitely a mixed model. The MIXED procedure assumes that the response is normally distributed (conditional on the predictors); GLIMMIX allows other distributional assumptions, among them normal, beta and binomial.

Generally speaking, but there are always exceptions, a percent (or proportion) response uses either a beta or a binomial distribution, so GLIMMIX is typically more appropriate than MIXED. A proportion that is obtained as a ratio of counts (e.g., number of "successes" out of number of "trials") calls for a binomial distribution; a proportion measured directly calls for a beta distribution, as noted by @plf515.

If I have a two-way (here, 2x2) factorial, I usually specify the model as A x B rather than a single factor with 4 levels. But the statistical model is the same either way; it's just a different parameterization, and one form may deliver what you want more directly than the other. You get to choose.

I would consider the following model AS A STARTING POINT (for a beta distribution; syntax would differ for binomial):

```
proc glimmix data= work.cd4cd8 ;
class donor a b t_type;
model activation = a | b | t_type / dist=beta;
random intercept a*b / subject=donor;
lsmeans a | b | t_type / ilink;
run;
```

Keep in mind that there are a lot of options that might be better than the default options implied in the code above. Default options are not always the best choice for generalized linear mixed models.

You may want to start with a normal distribution assumption within GLIMMIX before attempting more challenging models, even though a normal distribution is probably not a valid choice. It's good to get your feet under you with *general* linear mixed models before you dive into *generalized* linear mixed models.

I hope this helps.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Thanks for all the description. That really helps me learn. I tried the beta distribution today and that definitely improved the fit stats. But your explanation makes me wonder if binomial is the correct distribution since it is the fraction of a population of cells identified as activated by flow cytometry. I'll try that tomorrow and post some code. Thanks for the reply!

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Some "more friendly" (i.e., less mathematical) introductions into distributional choices:

For the beta distribution, A better lemon squeezer? Maximum-likelihood regression with beta-distributed dependent variables.

For binomial, etc., The arcsine is asinine: the analysis of proportions in ecology.

Looking forward to seeing what you find out....

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

So this is the solution I finally settled on.

```
proc glimmix data=work.CD4CD8 asycov plots=all;
class donor T_type A B ;
model activated = T_type|A|B / dist=binomial DDFM=KENWARDROGER ;
random _residual_ / subject=T_type(donor) type=un ;
lsmeans T_type*A*B / ilink slicediff=T_type adjust=tukey ADJDFE=ROW plots=meanplot(ilink join cl sliceby=T_type) ;
output out=gmxout pred(blup ilink)=pred;
run;
```

The output indicates very good fit. I really appreciate the input from this community.

Dave

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

I'm glad you found the Community to be helpful.

Before we drop this topic, I want to follow up because I'm not sure that I am entirely happy with your model.

(1) The response variable is *activated *and the distribution is *binomial*. Does *activated* take values of 0 or 1, or is it a proportion?

(2) An important distinction between the normal distribution and distributions available in generalized linear (mixed) model procedures is that the variance is a function of the mean for these non-normal distributions. Consequently, although there are *residuals*, there is no such thing as *residual variance* because once the mean is estimated, the variance is also known.* *And consequently then, I do not think I would use

```
random _residual_ / subject=T_type(donor) type=un ;
```

at least, not without a lot of thought about what it is doing and whether it is valid.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

The response is a proportion varying between 0 and 1. The fit stat Gen. Chi-Square / DF = 1.00.

Dave

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

If the response is a proportion, then I probably would use the beta distribution rather than the binomial.

The binomial distribution theoretically is appropriate either for binary (Bernoulli) data (taking values of 0 or 1) or for responses specified as "number of successes" out of "number of trials" using the syntax

MODEL events/trials = <fixed-effects> </ model-options>;

where "events" is number_of_activated cells and "trials" is total_number_of_cells.

**Available on demand!**

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.