Paulet
Calcite | Level 5

Hello, 

 

I am wondering how to correct for over- and underdispersion in GLIMMIX. Could someone help me? Thanks!

 

Proc glimmix data = doc1;
ID Idn;
class diet strain;
model Thick = diet|strain / DDFM = KENWARDROGER;
random residual / subject = pen;
lsmeans diet|strain;
run;

 

The output is often:

 

Generalized Chi-Square 454.29
Gener. Chi-Square / DF 8.11

or

Generalized Chi-Square 6.44
Gener. Chi-Square / DF 0.11

 

 

1 ACCEPTED SOLUTION

PaigeMiller
Diamond | Level 26

"So, for my understanding. In this case the Chi-square/df does not really say something with this model."


The opposite is true. It gives you some measure of dispersion of the data around the fitted line.

--
Paige Miller


16 REPLIES
Paulet
Calcite | Level 5

I tried it with this statement added:

 

Proc glimmix data = Anu.organday1_3;
ID Idn;
class diet strain;
model Gizz_Thick_rel = diet|strain / DDFM = KENWARDROGER;
random residual / subject = pen;
random_residual_ = Gizz_thick_rel;
lsmeans diet|strain;
run;

 

However, it stays 8.11

PaigeMiller
Diamond | Level 26

Can you please show the entire output from PROC GLIMMIX instead of just a few lines?

 

Please click on the {i} icon and paste the output, as text, into the window that appears. Do not skip this step.

--
Paige Miller
Paulet
Calcite | Level 5
Proc glimmix data = Anu.organweightday123_3;
ID Idn;
class diet strain;
model Pancreas_rel = diet|strain / DDFM = KENWARDROGER;
random residual / subject = pen;
lsmeans diet|strain;
run;

The SAS System 


The GLIMMIX Procedure

Model Information 
Data Set ANU.ORGANWEIGHTDAY123_3 
Response Variable Pancreas_rel 
Response Distribution Gaussian 
Link Function Identity 
Variance Function Default 
Variance Matrix Blocked By Pen 
Estimation Technique Restricted Maximum Likelihood 
Degrees of Freedom Method Kenward-Roger 
Fixed Effects SE Adjustment Kenward-Roger 



Class Level Information 
Class Levels Values 
diet 2 1 2 
strain 2 A B 



Number of Observations Read 196 
Number of Observations Used 60 



Dimensions 
R-side Cov. Parameters 1 
Columns in X 9 
Columns in Z per Subject 0 
Subjects (Blocks in V) 30 
Max Obs per Subject 3 



Optimization Information 
Optimization Technique None 
Parameters 0 
Lower Boundaries 0 
Upper Boundaries 0 
Fixed Effects Profiled 
Residual Variance Profiled 
Starting From Data 



Fit Statistics 
-2 Res Log Likelihood 280.34 
AIC (smaller is better) 282.34 
AICC (smaller is better) 282.42 
BIC (smaller is better) 283.74 
CAIC (smaller is better) 284.74 
HQIC (smaller is better) 282.79 
Generalized Chi-Square 403.67 
Gener. Chi-Square / DF 7.21 



Covariance Parameter Estimates 
Cov Parm Estimate Standard Error 
Residual (VC) 7.2084 1.3623 



Type III Tests of Fixed Effects 
Effect Num DF Den DF F Value Pr > F 
diet 1 56 0.00 0.9502 
strain 1 56 1.49 0.2266 
diet*strain 1 56 1.02 0.3177 



diet Least Squares Means 
diet Estimate Standard Error DF t Value Pr > |t| 
1 3.4287 0.4784 56 7.17 <.0001 
2 3.4724 0.5074 56 6.84 <.0001 



strain Least Squares Means 
strain Estimate Standard Error DF t Value Pr > |t| 
A 3.0242 0.5074 56 5.96 <.0001 
B 3.8768 0.4784 56 8.10 <.0001 



diet*strain Least Squares Means 
strain diet Estimate Standard Error DF t Value Pr > |t| 
A 1 2.6508 0.7176 56 3.69 0.0005 
B 1 4.2065 0.6328 56 6.65 <.0001 
A 2 3.3976 0.7176 56 4.73 <.0001 
B 2 3.5472 0.7176 56 4.94 <.0001 


PaigeMiller
Diamond | Level 26

Hmmm ... okay, are you saying that this represents over-dispersion? Why do you say that?

 

 

--
Paige Miller
sld
Rhodochrosite | Level 12

As you ponder your answer to @PaigeMiller's question, keep in mind that overdispersion is not an issue for the Gaussian distribution (which is what your model assumes). I think Paige may be trying to make you think about your understanding of the model and the output, and I am, maybe, giving you a hint.

 

We do consider the possibility of overdispersion for one-parameter distributions in the exponential family, like binomial and Poisson.
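For a count response where overdispersion can arise, the usual options might be sketched as follows. This is only an illustration under stated assumptions: the data set WORK.COUNTS and the response Y are hypothetical, DIET, STRAIN and PEN simply reuse the variable names from this thread, and the G-side pen intercept is one possible way to account for pens, not the original poster's model.

```sas
/* Hypothetical data set WORK.COUNTS with a count response Y.            */

/* 1) Plain Poisson with a G-side pen effect: inspect                    */
/*    "Gener. Chi-Square / DF" in the Fit Statistics table.              */
proc glimmix data=work.counts;
  class diet strain pen;
  model y = diet|strain / dist=poisson link=log;
  random intercept / subject=pen;
run;

/* 2) If that ratio is far from 1, one option is a multiplicative        */
/*    scale (overdispersion) parameter on the R side.                    */
proc glimmix data=work.counts;
  class diet strain pen;
  model y = diet|strain / dist=poisson link=log;
  random intercept / subject=pen;
  random _residual_;
run;

/* 3) Another option is a distribution with its own dispersion           */
/*    parameter, such as the negative binomial.                          */
proc glimmix data=work.counts;
  class diet strain pen;
  model y = diet|strain / dist=negbin link=log;
  random intercept / subject=pen;
run;
```

Once `random _residual_` is in the model, the reported ratio reflects the estimated scale parameter rather than a check against the Poisson variance, so the diagnostic reading belongs to the first fit.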

 

sld
Rhodochrosite | Level 12

Let me also note that the reason that @PaigeMiller asks you for the actual code that you ran AND the actual output from that code is because the Community cannot assess your problem unless we have enough of the right details. If you look at your previous posts and compare them to your last post, you'll see that your model changes with each post. From our point of view, it is hard to hit a moving target. So thank you for taking Paige's advice.

 

Paulet
Calcite | Level 5
Well, after reading several articles and watching several YouTube videos, I thought that the generalized chi-square/df should be near 1.

And in this case I showed you a different response variable, indeed one with a Gaussian distribution.

So I get different values of Gener. Chi-Square / DF with the same model, and none of them is near 1.

But you are saying that with the Gaussian distribution, this value is not a problem?
PaigeMiller
Diamond | Level 26

Yes, for the Gaussian distribution (which is the default fit by GLIMMIX), there is no such thing as overdispersion. That doesn't mean the model fits well; there can be other problems, but not overdispersion.

 

You see, Gaussian distributions are fit in such a way that the mean and the variance are both estimated from the data. In other distributions, such as the Poisson or exponential, the variance is fixed by the mean before the model is fit, and when the variance estimated from the fit is not close to that implied variance, you have underdispersion or overdispersion (example: for a Poisson distribution, the variance must equal the mean).
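As a rough check against the Gaussian output posted earlier in this thread: there, Gener. Chi-Square / DF appears to be simply the estimated residual variance, which is why it says nothing about overdispersion. The sketch below assumes the degrees of freedom are the observations used minus the rank of X (60 - 4 = 56, which matches the Den DF in the Type III tests); the variable names are only illustrative.

```sas
data _null_;
  gen_chisq = 403.67;      /* Generalized Chi-Square from Fit Statistics */
  df        = 60 - 4;      /* assumed: obs used minus rank of X          */
  ratio     = gen_chisq / df;
  put ratio= 8.4;          /* log shows ratio=7.2084                     */
run;
```

The result agrees with the Residual (VC) estimate of 7.2084 in the Covariance Parameter Estimates table.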

--
Paige Miller
Paulet
Calcite | Level 5
Thank you for the explanation.

However, a question arises: what are the other problems, and would it be better to use an entirely different model, like PROC MIXED?
PaigeMiller
Diamond | Level 26

As far as I know, for the Gaussian case, PROC MIXED and PROC GLIMMIX should produce the same results for the same model.
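A minimal sketch of that comparison, reusing the data set and variable names from the GLIMMIX call posted above. Listing pen in the CLASS statement is an assumption added here, and the REPEATED statement with its default variance-components structure is intended to mirror `random residual / subject = pen`.

```sas
/* Gaussian model in PROC MIXED, intended to match the GLIMMIX fit.      */
proc mixed data=Anu.organweightday123_3;
  class diet strain pen;              /* pen in CLASS is an assumption   */
  model Pancreas_rel = diet|strain / ddfm=kenwardroger;
  repeated / subject=pen;             /* default TYPE=VC, one residual   */
                                      /* variance for all observations   */
  lsmeans diet|strain;
run;
```

If the two procedures agree, the residual variance estimate and the Type III tests from PROC MIXED should match the GLIMMIX tables shown earlier.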

 

Other problems: poor model fit, which can happen even in non-Gaussian cases with no overdispersion.

 

But it's not clear why you don't like this model that you have fit, what is wrong with it?

--
Paige Miller
Paulet
Calcite | Level 5

So, for my understanding. In this case the Chi-square/df does not really say something with this model. However, what if you use the Poisson distribution, as in this example? Is it only in that case that the chi-square/df says something? Is there an article about this?

The SAS System 


The GLIMMIX Procedure

Model Information 
Data Set ANU.ORGANDAY1_3 
Response Variable Progizz_score 
Response Distribution Poisson 
Link Function Log 
Variance Function Default 
Variance Matrix Blocked By Pen 
Estimation Technique Residual PL 
Degrees of Freedom Method Kenward-Roger 
Fixed Effects SE Adjustment Kenward-Roger 



Class Level Information 
Class Levels Values 
diet 2 1 2 
strain 2 A B 



Number of Observations Read 60 
Number of Observations Used 60 



Dimensions 
R-side Cov. Parameters 1 
Columns in X 9 
Columns in Z per Subject 0 
Subjects (Blocks in V) 30 
Max Obs per Subject 3 



Optimization Information 
Optimization Technique None 
Parameters 0 
Lower Boundaries 0 
Upper Boundaries 0 
Fixed Effects Profiled 
Residual Variance Profiled 
Starting From Data 



Iteration History 
Iteration Restarts Subiterations Objective Function Change Max Gradient 
0 0 0 -12.50422765 0.30362060 . 
1 0 0 -9.527888119 0.00779297 . 
2 0 0 -9.480642908 0.00000395 . 
3 0 0 -9.480626779 0.00000000 . 



Convergence criterion (PCONV=1.11022E-8) satisfied. 



Fit Statistics 
-2 Res Log Pseudo-Likelihood -9.48 
Generalized Chi-Square 7.82 
Gener. Chi-Square / DF 0.14 



Covariance Parameter Estimates 
Cov Parm Estimate Standard Error 
Residual (VC) 0.1397 0.02639 



Type III Tests of Fixed Effects 
Effect Num DF Den DF F Value Pr > F 
diet 1 56 3.88 0.0539 
strain 1 56 1.64 0.2059 
diet*strain 1 56 0.81 0.3705 



diet Least Squares Means 
diet Estimate Standard Error DF t Value Pr > |t| 
1 1.1795 0.03708 56 31.81 <.0001 
2 1.2829 0.03719 56 34.50 <.0001 



strain Least Squares Means 
strain Estimate Standard Error DF t Value Pr > |t| 
A 1.1976 0.03886 56 30.82 <.0001 
B 1.2648 0.03532 56 35.81 <.0001 



diet*strain Least Squares Means 
strain diet Estimate Standard Error DF t Value Pr > |t| 
A 1 1.1221 0.05699 56 19.69 <.0001 
B 1 1.2368 0.04746 56 26.06 <.0001 
A 2 1.2730 0.05285 56 24.09 <.0001 
B 2 1.2928 0.05233 56 24.70 <.0001 

 

PaigeMiller
Diamond | Level 26

Maybe you could answer my questions that you haven't yet answered. Specifically:

 

"are you saying that this represents over-dispersion? Why do you say that?"

 

"it's not clear why you don't like this model that you have fit, what is wrong with it?"

 

 

--
Paige Miller
Paulet
Calcite | Level 5

Maybe it was just a misunderstanding. From your other post I thought you were saying that there was no overdispersion, but that there might be different problems. So I thought you meant that the model did not look good.

 

I am content with the model, but I was just confused by the high value of the chi-square/df.
