Solved: Re: correcting/ adjusting for overdispersion and underdispersion

Paulet · Posted 10-26-2019 09:35 AM

Hello,

I am wondering how to correct for over- and underdispersion in PROC GLIMMIX. Could someone help me? Thanks!

Proc glimmix data = doc1;
ID Idn;
class diet strain;
model Thick = diet|strain / DDFM = KENWARDROGER;
random residual / subject = pen;
lsmeans diet|strain;
run;

The output is often:

Generalized Chi-Square    454.29
Gener. Chi-Square / DF    8.11

or

Generalized Chi-Square    6.44
Gener. Chi-Square / DF    0.11

PaigeMiller · Posted 10-27-2019 10:59 AM

"So, as I understand it, in this case the chi-square/df does not really say anything about this model."

The opposite is true. It gives you some measure of the dispersion of the data around the fitted line.

--
Paige Miller


PaigeMiller · Posted 10-26-2019 09:46 AM

Does this help?

https://documentation.sas.com/?docsetId=statug&docsetTarget=statug_glimmix_details05.htm%3Flocale&do...

--
Paige Miller

Paulet · Posted 10-26-2019 10:07 AM

I tried adding this statement:

Proc glimmix data = Anu.organday1_3;
ID Idn;
class diet strain;
model Gizz_Thick_rel = diet|strain / DDFM = KENWARDROGER;
random residual / subject = pen;
random_residual_ = Gizz_thick_rel;
lsmeans diet|strain;
run;

However, the value stays at 8.11.

PaigeMiller · Posted 10-26-2019 11:17 AM

Can you please show the entire output from PROC GLIMMIX instead of just a few lines?

Please click on the {i} icon and paste the output, as text, into the window that appears. Do not skip this step.

--
Paige Miller

Paulet · Posted 10-26-2019 06:57 PM

Proc glimmix data = Anu.organweightday123_3;
ID Idn;
class diet strain;
model Pancreas_rel = diet|strain / DDFM = KENWARDROGER;
random residual / subject = pen;
lsmeans diet|strain;
run;

The SAS System 


The GLIMMIX Procedure

Model Information 
Data Set ANU.ORGANWEIGHTDAY123_3 
Response Variable Pancreas_rel 
Response Distribution Gaussian 
Link Function Identity 
Variance Function Default 
Variance Matrix Blocked By Pen 
Estimation Technique Restricted Maximum Likelihood 
Degrees of Freedom Method Kenward-Roger 
Fixed Effects SE Adjustment Kenward-Roger 



Class Level Information 
Class Levels Values 
diet 2 1 2 
strain 2 A B 



Number of Observations Read 196 
Number of Observations Used 60 



Dimensions 
R-side Cov. Parameters 1 
Columns in X 9 
Columns in Z per Subject 0 
Subjects (Blocks in V) 30 
Max Obs per Subject 3 



Optimization Information 
Optimization Technique None 
Parameters 0 
Lower Boundaries 0 
Upper Boundaries 0 
Fixed Effects Profiled 
Residual Variance Profiled 
Starting From Data 



Fit Statistics 
-2 Res Log Likelihood 280.34 
AIC (smaller is better) 282.34 
AICC (smaller is better) 282.42 
BIC (smaller is better) 283.74 
CAIC (smaller is better) 284.74 
HQIC (smaller is better) 282.79 
Generalized Chi-Square 403.67 
Gener. Chi-Square / DF 7.21 



Covariance Parameter Estimates 
Cov Parm         Estimate   Standard Error
Residual (VC)      7.2084           1.3623



Type III Tests of Fixed Effects 
Effect Num DF Den DF F Value Pr > F 
diet 1 56 0.00 0.9502 
strain 1 56 1.49 0.2266 
diet*strain 1 56 1.02 0.3177 



diet Least Squares Means 
diet   Estimate   Standard Error   DF   t Value   Pr > |t|
1        3.4287           0.4784   56      7.17     <.0001
2        3.4724           0.5074   56      6.84     <.0001



strain Least Squares Means 
strain   Estimate   Standard Error   DF   t Value   Pr > |t|
A          3.0242           0.5074   56      5.96     <.0001
B          3.8768           0.4784   56      8.10     <.0001



diet*strain Least Squares Means 
strain   diet   Estimate   Standard Error   DF   t Value   Pr > |t|
A        1        2.6508           0.7176   56      3.69     0.0005
B        1        4.2065           0.6328   56      6.65     <.0001
A        2        3.3976           0.7176   56      4.73     <.0001
B        2        3.5472           0.7176   56      4.94     <.0001

PaigeMiller · Posted 10-26-2019 07:15 PM

Hmmm ... okay, are you saying that this represents over-dispersion? Why do you say that?

--
Paige Miller

sld · Posted 10-26-2019 10:04 PM

As you ponder your answer to @PaigeMiller's question, keep in mind that overdispersion is not an issue for the Gaussian distribution (which is what your model assumes). I think Paige may be trying to get you to think about your understanding of the model and the output, and I am, maybe, giving you a hint.

We do consider the possibility of overdispersion for one-parameter distributions in the exponential family, like binomial and Poisson.
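
A minimal sketch of how such a check might look in PROC GLIMMIX for a count response. The data set name `mydata` and the response `count_y` are hypothetical, and the sketch ignores the pen structure for simplicity, so treat it as an illustration rather than a recommended analysis:

```sas
/* Step 1: plain Poisson fit with no scale parameter.              */
/* Look at "Pearson Chi-Square / DF" in the Fit Statistics table;  */
/* values far from 1 suggest over- or underdispersion.             */
proc glimmix data=mydata;               /* mydata: hypothetical data set */
   class diet strain;
   model count_y = diet|strain / dist=poisson link=log;
run;

/* Step 2: same fit with a multiplicative scale parameter added    */
/* to the Poisson variance function.                               */
proc glimmix data=mydata;
   class diet strain;
   model count_y = diet|strain / dist=poisson link=log;
   random _residual_;                   /* R-side scale parameter */
run;
```

With the second form the variance is modeled as a multiple of the mean, so the standard errors are rescaled by the estimated multiplier rather than assuming the variance equals the mean.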

sld · Posted 10-26-2019 10:13 PM

Let me also note that the reason that @PaigeMiller asks you for the actual code that you ran AND the actual output from that code is because the Community cannot assess your problem unless we have enough of the right details. If you look at your previous posts and compare them to your last post, you'll see that your model changes with each post. From our point of view, it is hard to hit a moving target. So thank you for taking Paige's advice.

Paulet · Posted 10-27-2019 03:27 AM

Well, after reading several articles and watching several YouTube videos, I thought that the generalized chi-square/df should be near 1.

And in this case I showed you a different response variable, indeed one with a Gaussian distribution.

So I get different values of Gener. Chi-Square / DF with the same model, and none of them is near 1.

But you are saying that with the Gaussian distribution, this value is not a problem?

PaigeMiller · Posted 10-27-2019 06:48 AM

Yes, for the Gaussian distribution (which is the default fit by GLIMMIX), there is no such thing as overdispersion. That doesn't mean the model fits well; there can be other problems, but not overdispersion.

You see, Gaussian models are fit in such a way that both the mean and the variance are estimated from the data. In other distributions, such as the Poisson or exponential, the variance is determined by the mean before the model is fit. When the variance estimated from the model fit is not close to that implied variance, you have underdispersion or overdispersion (example: for a Poisson distribution, the variance must be equal to the mean).
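
As a worked check against the Gaussian output posted earlier in this thread: assuming GLIMMIX computes the generalized chi-square for this model as the residual sum of squares, and that the degrees of freedom are n minus the rank of X (60 - 4 = 56, which matches the Den DF shown), the ratio is just the estimated residual variance, so there is no reason for it to be near 1.

```latex
\[
\frac{X^2_g}{\mathrm{df}}
  = \frac{\sum_{i=1}^{n} (y_i - \hat{\mu}_i)^2}{n - \mathrm{rank}(X)}
  = \frac{403.67}{56}
  \approx 7.21,
\qquad
\hat{\sigma}^2 = 7.2084 \ \ (\text{Residual (VC)})
\]
```

The two numbers agree to the reported precision, which is consistent with the statistic here measuring the scale of the response rather than lack of fit.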

--
Paige Miller

Paulet · Posted 10-27-2019 08:14 AM

Thank you for the explanation.

However, a question arises: what are the other problems, and would it be better to use an entirely different model, like PROC MIXED?

PaigeMiller · Posted 10-27-2019 08:51 AM

As far as I know, for the Gaussian case, PROC MIXED and PROC GLIMMIX should produce the same results for the same model.
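
For comparison, a sketch of what the same Gaussian model might look like in PROC MIXED. It assumes that the RANDOM residual statement in the GLIMMIX code contributes only a single residual variance (as the "Residual (VC)" line in the output suggests), which PROC MIXED fits by default, so no REPEATED statement is included:

```sas
/* Gaussian model with one residual variance, REML (the MIXED default) */
proc mixed data=Anu.organweightday123_3;
   class diet strain;
   model Pancreas_rel = diet|strain / ddfm=kenwardroger;
   lsmeans diet strain diet*strain;
run;
```

If the two procedures agree, the residual variance reported here should match the 7.2084 that GLIMMIX shows.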

Other problems: poor model fit, which can happen even in non-Gaussian cases with no overdispersion.

But it's not clear why you don't like this model that you have fit. What is wrong with it?

--
Paige Miller

Paulet · Posted 10-27-2019 09:59 AM

So, as I understand it, in this case the chi-square/df does not really say anything about this model. However, if you use the Poisson distribution, as in the example below, does the chi-square/df say something only in that case? Is there an article about this?

The SAS System 


The GLIMMIX Procedure

Model Information 
Data Set ANU.ORGANDAY1_3 
Response Variable Progizz_score 
Response Distribution Poisson 
Link Function Log 
Variance Function Default 
Variance Matrix Blocked By Pen 
Estimation Technique Residual PL 
Degrees of Freedom Method Kenward-Roger 
Fixed Effects SE Adjustment Kenward-Roger 



Class Level Information 
Class Levels Values 
diet 2 1 2 
strain 2 A B 



Number of Observations Read 60 
Number of Observations Used 60 



Dimensions 
R-side Cov. Parameters 1 
Columns in X 9 
Columns in Z per Subject 0 
Subjects (Blocks in V) 30 
Max Obs per Subject 3 



Optimization Information 
Optimization Technique None 
Parameters 0 
Lower Boundaries 0 
Upper Boundaries 0 
Fixed Effects Profiled 
Residual Variance Profiled 
Starting From Data 



Iteration History 
Iteration   Restarts   Subiterations   Objective Function       Change   Max Gradient
0           0          0                     -12.50422765   0.30362060   .
1           0          0                     -9.527888119   0.00779297   .
2           0          0                     -9.480642908   0.00000395   .
3           0          0                     -9.480626779   0.00000000   .



Convergence criterion (PCONV=1.11022E-8) satisfied. 



Fit Statistics 
-2 Res Log Pseudo-Likelihood -9.48 
Generalized Chi-Square 7.82 
Gener. Chi-Square / DF 0.14 



Covariance Parameter Estimates 
Cov Parm         Estimate   Standard Error
Residual (VC)      0.1397          0.02639



Type III Tests of Fixed Effects 
Effect Num DF Den DF F Value Pr > F 
diet 1 56 3.88 0.0539 
strain 1 56 1.64 0.2059 
diet*strain 1 56 0.81 0.3705 



diet Least Squares Means 
diet   Estimate   Standard Error   DF   t Value   Pr > |t|
1        1.1795          0.03708   56     31.81     <.0001
2        1.2829          0.03719   56     34.50     <.0001



strain Least Squares Means 
strain   Estimate   Standard Error   DF   t Value   Pr > |t|
A          1.1976          0.03886   56     30.82     <.0001
B          1.2648          0.03532   56     35.81     <.0001



diet*strain Least Squares Means 
strain   diet   Estimate   Standard Error   DF   t Value   Pr > |t|
A        1        1.1221          0.05699   56     19.69     <.0001
B        1        1.2368          0.04746   56     26.06     <.0001
A        2        1.2730          0.05285   56     24.09     <.0001
B        2        1.2928          0.05233   56     24.70     <.0001
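
A quick arithmetic reading of this Poisson output. It assumes, as the "R-side Cov. Parameters 1" and "Residual (VC)" lines suggest, that the RANDOM residual statement adds a multiplicative scale parameter to the Poisson variance:

```latex
\[
\operatorname{Var}(y_i) = \phi\,\mu_i,
\qquad
\hat{\phi} = \frac{X^2_g}{\mathrm{df}} = \frac{7.82}{56} \approx 0.14
\ \ (\text{Residual (VC)} = 0.1397)
\]
```

Read this way, a value well below 1 would suggest that the scores vary less than a pure Poisson model (a scale of 1) implies, and that the fitted scale parameter is already absorbing that difference.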

PaigeMiller · Posted 10-27-2019 10:51 AM

Maybe you could answer my questions that you haven't yet answered. Specifically:

"are you saying that this represents over-dispersion? Why do you say that?"

"it's not clear why you don't like this model that you have fit. What is wrong with it?"

--
Paige Miller

Paulet · Posted 10-27-2019 11:02 AM

Maybe it was just a misunderstanding: from the other post I understood you to be saying that there was no overdispersion, but that there might be other problems. So I thought you meant that the model did not look good.

I am content with the model, but I was just confused by the high value of the chi-square/df.
