turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- Analytics
- /
- Stat Procs
- /
- ask for help with glimmix

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-21-2017 11:26 AM

Hi All,

I am a newbie to SAS and struggling with my data analysis. What I am trying to do is to compare the sporaniga production ability of two fungal isolates. In my experiment, I used one diseased plant infected by each isolate, and counted the number of sporangia around the root tips of each plant, and I did that for 3 times. And I am trying to use PROC GLIMMIX to analyze my data. I have attached part of my data and the a couple of sas program I tried below.

My specific questions are: 1) the dependent variable does not follow normal distribution, if I specified it follows poission distribution in proc glimmix, do I need to further transform the data? Which one works better, data transformation or specifying the poission distribution?

2) I am not quite sure what "random _residual_" does, does it necessary to be in the code? Because in one of the progams I tried, it fails to converge with the statment random _residual_.

3) Would you please comment on the several sas programs I tried to analyze my data? What should be done to modify the program?

Hope you can help me out. I'll really appreicate your help!

Thanks,

Jing

```
data sporangia_production_comparison;
input isolate$ run sporangia sqrtsporangia;
datalines;
R0-G5-6 1 60 8
R0-G5-6 1 54 7
R0-G5-6 1 52 7
R0-G5-6 1 51 7
R0-G5-6 1 50 7
R0-G5-6 1 42 6
R0-G5-6 1 42 6
R0-G5-6 1 39 6
R0-G5-6 1 37 6
R0-G5-6 1 32 6
R0-G5-6 1 30 5
R0-G5-6 1 24 5
R0-G5-6 1 23 5
R0-G5-6 1 23 5
R0-G5-6 1 23 5
R0-G5-6 1 20 4
R0-G5-6 1 18 4
R0-G5-6 1 16 4
R0-G5-6 1 15 4
R0-G5-6 1 11 3
R0-G5-6 1 8 3
R0-G5-6 1 5 2
R0-G5-6 1 4 2
R0-G5-6 1 4 2
R0-G5-6 1 3 2
R0-G5-6 1 3 2
R0-G5-6 1 0 0
R0-G5-6 1 0 0
R0-G5-6 1 0 0
R0-G5-6 2 89 9
R0-G5-6 2 75 9
R0-G5-6 2 70 8
R0-G5-6 2 69 8
R0-G5-6 2 65 8
R0-G5-6 2 39 6
R0-G5-6 2 37 6
R0-G5-6 2 34 6
R0-G5-6 2 33 6
R0-G5-6 2 27 5
R0-G5-6 2 26 5
R0-G5-6 2 25 5
R0-G5-6 2 22 5
R0-G5-6 2 19 4
R0-G5-6 2 18 4
R0-G5-6 2 17 4
R0-G5-6 2 16 4
R0-G5-6 2 9 3
R0-G5-6 2 8 3
R0-G5-6 2 5 2
R0-G5-6 2 5 2
R0-G5-6 2 5 2
R0-G5-6 2 4 2
R0-G5-6 2 4 2
R0-G5-6 2 3 2
R0-G5-6 2 2 1
R0-G5-6 2 2 1
R0-G5-6 2 2 1
R0-G5-6 2 1 1
R0-G5-6 2 1 1
R0-G5-6 2 0 0
R0-G5-6 2 0 0
R0-G5-6 2 0 0
R0-G5-6 2 0 0
R0-G5-6 2 0 0
R0-G5-6 2 0 0
R0-G5-6 2 0 0
R0-G5-6 2 0 0
R0-G5-6 2 0 0
R0-G5-6 2 0 0
R0-G5-6 2 0 0
R0-G5-6 2 0 0
R0-G5-6 2 0 0
R0-G5-6 2 0 0
R0-G5-6 3 103 10
R0-G5-6 3 95 10
R0-G5-6 3 89 9
R0-G5-6 3 87 9
R0-G5-6 3 78 9
R0-G5-6 3 75 9
R0-G5-6 3 71 8
R0-G5-6 3 59 8
R0-G5-6 3 55 7
R0-G5-6 3 40 6
R0-G5-6 3 36 6
R0-G5-6 3 35 6
R0-G5-6 3 34 6
R0-G5-6 3 33 6
R0-G5-6 3 31 6
R0-G5-6 3 23 5
R0-G5-6 3 23 5
R0-G5-6 3 19 4
R0-G5-6 3 19 4
R0-G5-6 3 17 4
R0-G5-6 3 16 4
R0-G5-6 3 15 4
R0-G5-6 3 14 4
R0-G5-6 3 14 4
R0-G5-6 3 12 3
R0-G5-6 3 12 3
R0-G5-6 3 6 2
R0-G5-6 3 5 2
R0-G5-6 3 3 2
R0-G5-6 3 2 1
R0-G5-6 3 2 1
R0-G5-6 3 2 1
R0-G5-6 3 2 1
R0-G5-6 3 0 0
R0-G5-6 3 0 0
R0-G5-6 3 0 0
R0-G2-6 1 21 5
R0-G2-6 1 21 5
R0-G2-6 1 21 5
R0-G2-6 1 17 4
R0-G2-6 1 15 4
R0-G2-6 1 15 4
R0-G2-6 1 14 4
R0-G2-6 1 14 4
R0-G2-6 1 13 4
R0-G2-6 1 11 3
R0-G2-6 1 9 3
R0-G2-6 1 9 3
R0-G2-6 1 8 3
R0-G2-6 1 8 3
R0-G2-6 1 6 2
R0-G2-6 1 5 2
R0-G2-6 1 4 2
R0-G2-6 1 3 2
R0-G2-6 1 2 1
R0-G2-6 1 2 1
R0-G2-6 1 2 1
R0-G2-6 1 1 1
R0-G2-6 1 0 0
R0-G2-6 1 0 0
R0-G2-6 1 0 0
R0-G2-6 1 0 0
R0-G2-6 1 0 0
R0-G2-6 1 0 0
R0-G2-6 1 0 0
R0-G2-6 1 0 0
R0-G2-6 1 0 0
R0-G2-6 1 0 0
R0-G2-6 1 0 0
R0-G2-6 1 0 0
R0-G2-6 2 100 10
R0-G2-6 2 73 9
R0-G2-6 2 63 8
R0-G2-6 2 61 8
R0-G2-6 2 43 7
R0-G2-6 2 33 6
R0-G2-6 2 33 6
R0-G2-6 2 30 5
R0-G2-6 2 29 5
R0-G2-6 2 29 5
R0-G2-6 2 28 5
R0-G2-6 2 27 5
R0-G2-6 2 26 5
R0-G2-6 2 25 5
R0-G2-6 2 24 5
R0-G2-6 2 15 4
R0-G2-6 2 15 4
R0-G2-6 2 14 4
R0-G2-6 2 13 4
R0-G2-6 2 13 4
R0-G2-6 2 12 3
R0-G2-6 2 10 3
R0-G2-6 2 9 3
R0-G2-6 2 9 3
R0-G2-6 2 6 2
R0-G2-6 2 6 2
R0-G2-6 2 5 2
R0-G2-6 2 4 2
R0-G2-6 2 2 1
R0-G2-6 2 1 1
R0-G2-6 2 0 0
R0-G2-6 2 0 0
R0-G2-6 2 0 0
R0-G2-6 2 0 0
R0-G2-6 2 0 0
R0-G2-6 2 0 0
R0-G2-6 2 0 0
R0-G2-6 2 0 0
R0-G2-6 2 0 0
R0-G2-6 2 0 0
R0-G2-6 2 0 0
R0-G2-6 2 0 0
R0-G2-6 3 53 7
R0-G2-6 3 52 7
R0-G2-6 3 45 7
R0-G2-6 3 37 6
R0-G2-6 3 34 6
R0-G2-6 3 28 5
R0-G2-6 3 26 5
R0-G2-6 3 25 5
R0-G2-6 3 23 5
R0-G2-6 3 22 5
R0-G2-6 3 19 4
R0-G2-6 3 19 4
R0-G2-6 3 17 4
R0-G2-6 3 16 4
R0-G2-6 3 16 4
R0-G2-6 3 13 4
R0-G2-6 3 13 4
R0-G2-6 3 11 3
R0-G2-6 3 11 3
R0-G2-6 3 10 3
R0-G2-6 3 8 3
R0-G2-6 3 8 3
R0-G2-6 3 7 3
R0-G2-6 3 6 2
R0-G2-6 3 5 2
R0-G2-6 3 5 2
R0-G2-6 3 4 2
R0-G2-6 3 4 2
R0-G2-6 3 4 2
R0-G2-6 3 3 2
R0-G2-6 3 3 2
R0-G2-6 3 3 2
R0-G2-6 3 3 2
R0-G2-6 3 2 1
R0-G2-6 3 2 1
R0-G2-6 3 1 1
R0-G2-6 3 0 0
R0-G2-6 3 0 0
R0-G2-6 3 0 0
R0-G2-6 3 0 0
R0-G2-6 3 0 0
R0-G2-6 3 0 0
R0-G2-6 3 0 0
R0-G2-6 3 0 0
R0-G2-6 3 0 0
R0-G2-6 3 0 0
R0-G2-6 3 0 0
R0-G2-6 3 0 0
R0-G2-6 3 0 0
R0-G2-6 3 0 0
R0-G2-6 3 0 0
R0-G2-6 3 0 0
R0-G2-6 3 0 0
R0-G2-6 3 0 0
R0-G2-6 3 0 0
R0-G2-6 3 0 0
R0-G2-6 3 0 0
R0-G2-6 3 0 0
R0-G2-6 3 0 0
;
/*Program 1*/
proc glimmix data=sporangia_production_comparison;
class isolate;
model sporangia= isolate / dist=poisson;
random run;
lsmeans isolate/lines;
run;
/*Program 2*/
/*Fails to converge*/
proc glimmix data=sporangia_production_comparison;
class isolate;
model sporangia= isolate /dist=poisson ;
random run;
random _residual_;
lsmeans isolate/lines;
run;
/*Program 3*/
proc glimmix data=sporangia_production_comparison;
class isolate;
model sqrtsporangia= isolate ;
random run;
lsmeans isolate/lines;
run;
/*Program 4*/
proc glimmix data=sporangia_production_comparison;
class isolate;
model sqrtsporangia= isolate ;
random run;
random _residual_;
lsmeans isolate/lines;
run;
```

Accepted Solutions

Solution

2 weeks ago

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-25-2017 01:39 AM

Then a plant is your replicating (experimental) unit for the isolate factor, and roots are subsamples.

I would consider

proc glimmix data=spor method=laplace;

class run isolate;

model sporangia = isolate / dist=negbin;

random run(isolate);

lsmeans isolate / ilink;

run;

Because run takes values of (1,2,3) within each level of isolate, you need to tell GLIMMIX that you actually have (number of isolates) x (number of replicates) = (number of plants), which you can do either by providing a unique id for each plant or using

random run(isolate);

If you use

random run;

then GLIMMIX will think there are only 3 plants total.

I've used the negative binomial distribution here because, for your sample data, the Poisson distribution had problems with overdispersion. An alternative approach would be a quasi-Poisson; see

http://fisher.utstat.toronto.edu/reid/sta2201s/QUASI-POISSON.pdf

and

https://www.r-bloggers.com/estimating-quasi-poisson-regression-with-glimmix-in-sas/

I generally use method=laplace for GLMMs; but it doesn't always work well.

If you specify a non-normal distribution, then you will not use a transformed response variable. The link associated with the non-normal distribution does that job for you.

Theoretically, if data follow, say, a Poisson distribution, then a generalized linear model with the original count data would be preferable to a general linear model with a transformed response. In practice, it doesn't always work out that way, but that's usually how I start.

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-22-2017 11:39 PM

Please clarify: How many plants were inoculated with each isolate? Was only one plant inoculated with each isolate and then observed 3 times? Or were 3 plants inoculated with each isolate and observed once? Did you count sporangia on multiple roots on each plant?

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-24-2017 09:40 AM

Three plants were inoculated with each isolate, and I counted sporangia on each root tip of each inoculated plant. In the data I attached, the second column indicates which plant the root tips come from, and the third column indicated the number of sporangia around that specific root tip.

Solution

2 weeks ago

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-25-2017 01:39 AM

Then a plant is your replicating (experimental) unit for the isolate factor, and roots are subsamples.

I would consider

proc glimmix data=spor method=laplace;

class run isolate;

model sporangia = isolate / dist=negbin;

random run(isolate);

lsmeans isolate / ilink;

run;

Because run takes values of (1,2,3) within each level of isolate, you need to tell GLIMMIX that you actually have (number of isolates) x (number of replicates) = (number of plants), which you can do either by providing a unique id for each plant or using

random run(isolate);

If you use

random run;

then GLIMMIX will think there are only 3 plants total.

I've used the negative binomial distribution here because, for your sample data, the Poisson distribution had problems with overdispersion. An alternative approach would be a quasi-Poisson; see

http://fisher.utstat.toronto.edu/reid/sta2201s/QUASI-POISSON.pdf

and

https://www.r-bloggers.com/estimating-quasi-poisson-regression-with-glimmix-in-sas/

I generally use method=laplace for GLMMs; but it doesn't always work well.

If you specify a non-normal distribution, then you will not use a transformed response variable. The link associated with the non-normal distribution does that job for you.

Theoretically, if data follow, say, a Poisson distribution, then a generalized linear model with the original count data would be preferable to a general linear model with a transformed response. In practice, it doesn't always work out that way, but that's usually how I start.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-25-2017 11:38 AM

Thank you so much for your help!!!

I had several questions that I didn't quite understand.

1) when to use poisson and negtive binomial distrbutions? and when is better to do the data transformation?

2) what does method=laplace do? what's the default setting for method?

3) does ilink do the data transformation for me if I specify that the data do not follow normal distribution?

4) what does random _residual_ do? If I use random run(isolate), do I need to add random _residual_ in the sas code?

Sorry for the naive questions. Thanks for your time and kind attention! I really appreciated it!

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-25-2017 01:37 PM

1) You choose a distribution that is appropriate and provides a valid model that fits the data well. But, of course, not all data follow a known distribution, so identifying a model that is "good enough" can be a challenge. You have to be knowledgeable about your options, there are many texts on generalized linear models and modeling strategies that you can study, as well as courses for directed or self-study.

2) and 3) and 4) The documentation for GLIMMIX is a good place to start for the answer to these questions. Also

4) Did you look at this link in my previous message?

https://www.r-bloggers.com/estimating-quasi-poisson-regression-with-glimmix-in-sas/

It illustrates how to specify a quasi-Poisson distribution using random _residual_.

Generalized linear mixed models are complicated, can be difficult to fit, and can be challenging to interpret. You'll want to devote appreciable time to learning about them. You won't be able to learn everything you need to know through this forum. Seek out a statistician at your institution/company to work with directly, if one exists. (I know, they often do not.)

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-30-2017 09:49 PM

Thanks for your great help and the useful tips! I'll check those links out. Really appreciated your kind attention!

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-31-2017 10:03 AM

Hi, sorry to keep bugging you! But this question just came to my mind. I think the code you provided treated run as a fixed effect, is that correct?

If I wanted to treat run as a random effect, should I modify the SAS code to the one as follows:

proc glimmix data=spor method=laplace; class isolate; model sporangia = isolate / dist=negbin; random run run(isolate); lsmeans isolate / ilink; run;

Thanks,

Jing

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-31-2017 10:20 AM

Just realized I made a silly mistake. Thought variables that follow class statment are treated as fixed effects, that's not the case, they are just categorical variables. Sorry about the silly question.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-31-2017 11:52 PM

Glad you sorted it out.

The syntax can be confusing. Random effects factors never appear in the MODEL statement in GLIMMIX, only fixed effects factors. But we can use fixed effects factor *names* in the RANDOM statement to identify a full set of levels for a random effects factor; it's just a very handy syntax shortcut.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-28-2017 03:07 PM

Hi sld,

Sorry for keep bugging you. But I had this question for a while and I haven't been able to figure it out myself.

I was wondering why specifying negative binomial of the counting data would give negative estimates? Hope you could give me some hints. I will really appreciate it!

Thanks in advance!

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-28-2017 04:08 PM

Estimates of parameters or estimates of lsmeans?

In either case, estimates are on the link scale. So, for example, the lsmean estimate may be negative but its inverse link (produced with the ILINK option) will be non-negative.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-30-2017 01:29 PM

Hi sld, thanks for your kind reply!

Yes, I was asking about estimates of lsmeans. I just had another question, if the default link function for "negbin" is "log", does specifying negative binomial distribution for counting data equal to log transformation?

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-30-2017 04:26 PM

No, a log link is not the same as a log transformation. For more detail see

There is one exception: specifying a lognormal distribution produces the same results as applying a log transformation to y and then assuming normality. From the GLIMMIX documentation:

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

07-02-2017 09:39 AM

Hi sld, thank you so much for your kind reply!