Hello,
I am wondering how to correct for over- and underdispersion in GLIMMIX. Could someone help me? Thanks!
Proc glimmix data = doc1;
ID Idn;
class diet strain;
model Thick = diet|strain / DDFM = KENWARDROGER;
random residual / subject = pen;
lsmeans diet|strain;
run;
The output is often:
Generalized Chi-Square    454.29
Gener. Chi-Square / DF      8.11
or
Generalized Chi-Square      6.44
Gener. Chi-Square / DF      0.11
"So, for my understanding: in this case the Chi-square/df does not really say something with this model."
The opposite is true. It gives you some measure of dispersion of the data around the fitted line.
Does this help?
I tried it with this statement:
Proc glimmix data = Anu.organday1_3;
ID Idn;
class diet strain;
model Gizz_Thick_rel = diet|strain / DDFM = KENWARDROGER;
random residual / subject = pen;
random_residual_ = Gizz_thick_rel;
lsmeans diet|strain;
run;
However, the chi-square/df stays at 8.11.
Can you please show the entire output from PROC GLIMMIX instead of just a few lines?
Please click on the {i} icon and paste the output, as text, into the window that appears. Do not skip this step.
Proc glimmix data = Anu.organweightday123_3;
ID Idn;
class diet strain;
model Pancreas_rel = diet|strain / DDFM = KENWARDROGER;
random residual / subject = pen;
lsmeans diet|strain;
run;
The SAS System
The GLIMMIX Procedure

Model Information
Data Set                      ANU.ORGANWEIGHTDAY123_3
Response Variable             Pancreas_rel
Response Distribution         Gaussian
Link Function                 Identity
Variance Function             Default
Variance Matrix Blocked By    Pen
Estimation Technique          Restricted Maximum Likelihood
Degrees of Freedom Method     Kenward-Roger
Fixed Effects SE Adjustment   Kenward-Roger

Class Level Information
Class    Levels   Values
diet     2        1 2
strain   2        A B

Number of Observations Read   196
Number of Observations Used   60

Dimensions
R-side Cov. Parameters     1
Columns in X               9
Columns in Z per Subject   0
Subjects (Blocks in V)     30
Max Obs per Subject        3

Optimization Information
Optimization Technique   None
Parameters               0
Lower Boundaries         0
Upper Boundaries         0
Fixed Effects            Profiled
Residual Variance        Profiled
Starting From            Data

Fit Statistics
-2 Res Log Likelihood       280.34
AIC (smaller is better)     282.34
AICC (smaller is better)    282.42
BIC (smaller is better)     283.74
CAIC (smaller is better)    284.74
HQIC (smaller is better)    282.79
Generalized Chi-Square      403.67
Gener. Chi-Square / DF      7.21

Covariance Parameter Estimates
Cov Parm        Estimate   Standard Error
Residual (VC)   7.2084     1.3623

Type III Tests of Fixed Effects
Effect        Num DF   Den DF   F Value   Pr > F
diet          1        56       0.00      0.9502
strain        1        56       1.49      0.2266
diet*strain   1        56       1.02      0.3177

diet Least Squares Means
diet   Estimate   Standard Error   DF   t Value   Pr > |t|
1      3.4287     0.4784           56   7.17      <.0001
2      3.4724     0.5074           56   6.84      <.0001

strain Least Squares Means
strain   Estimate   Standard Error   DF   t Value   Pr > |t|
A        3.0242     0.5074           56   5.96      <.0001
B        3.8768     0.4784           56   8.10      <.0001

diet*strain Least Squares Means
strain   diet   Estimate   Standard Error   DF   t Value   Pr > |t|
A        1      2.6508     0.7176           56   3.69      0.0005
B        1      4.2065     0.6328           56   6.65      <.0001
A        2      3.3976     0.7176           56   4.73      <.0001
B        2      3.5472     0.7176           56   4.94      <.0001
Hmmm ... okay, are you saying that this represents over-dispersion? Why do you say that?
As you ponder your answer to @PaigeMiller's question, keep in mind that overdispersion is not an issue for the Gaussian distribution (which is what your model assumes). I think Paige may be trying to make you think about your understanding of the model and the output, and I am, maybe, giving you a hint.
We do consider the possibility of overdispersion for one-parameter distributions in the exponential family, like binomial and Poisson.
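As a rough sketch of how such a check might look in PROC GLIMMIX for a count response: the first call fits a Poisson model with no extra scale parameter, so the Pearson chi-square/DF can be compared against 1; the second adds a multiplicative scale parameter with RANDOM _RESIDUAL_. The data set and variable names are borrowed from the Poisson output posted later in this thread. The random intercept for pen and the use of METHOD=LAPLACE are assumptions made for illustration, not the original poster's model.

```sas
/* Sketch 1 (assumed model): Poisson GLMM without a scale parameter.  */
/* With METHOD=LAPLACE, the "Fit Statistics for Conditional           */
/* Distribution" table reports Pearson Chi-Square / DF; values far    */
/* from 1 hint at over- or underdispersion relative to the Poisson.   */
proc glimmix data=Anu.organday1_3 method=laplace;
   class diet strain pen;
   model Progizz_score = diet|strain / dist=poisson link=log;
   random intercept / subject=pen;   /* assumption: pen as a random effect */
run;

/* Sketch 2 (assumed model): add a multiplicative scale parameter.    */
/* _RESIDUAL_ is a keyword of the RANDOM statement, not a variable,   */
/* so it is not assigned a value.                                     */
proc glimmix data=Anu.organday1_3;
   class diet strain pen;
   model Progizz_score = diet|strain / dist=poisson link=log ddfm=kenwardroger;
   random intercept / subject=pen;
   random _residual_;                /* R-side scale (overdispersion) parameter */
   lsmeans diet|strain / ilink;      /* ILINK reports means on the count scale  */
run;
```

Once a scale parameter is in the model, the generalized chi-square/DF essentially reports that estimated scale, so the first sketch is the one to look at when judging dispersion.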
Let me also note that @PaigeMiller asks you for the actual code that you ran AND the actual output from that code because the Community cannot assess your problem unless we have enough of the right details. If you look at your previous posts and compare them to your last post, you'll see that your model changes with each post. From our point of view, it is hard to hit a moving target. So thank you for taking Paige's advice.
Yes, for the Gaussian distribution (which is the default fit by GLIMMIX), there is no such thing as overdispersion. That doesn't mean the model fits well; there can be other problems, but not overdispersion.
You see, Gaussian distributions are fit in such a way that both the mean and the variance are estimated from the data. In other distributions, such as the Poisson or exponential, the variance is determined by the mean, so it is fixed once the mean is fit. When the variance seen in the data is not close to the variance the distribution implies, you have underdispersion or overdispersion (example: if you have a Poisson distribution, the variance must be equal to the mean).
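To make the contrast concrete, here is a small worked illustration in common GLM notation. The symbols $\mu$, $\sigma^2$ and $\phi$ are introduced here for illustration and do not appear in the posts. The numbers come from the Gaussian output posted earlier in this thread, with 56 being the denominator DF shown in its Type III tests. In that output the chi-square/DF value appears to coincide with the residual variance estimate, rather than measuring a departure from some known variance.

```latex
\begin{align*}
\text{Gaussian:}\quad
  & \operatorname{Var}(Y) = \sigma^2
  && \text{(free parameter, estimated from the data)} \\
\text{Poisson:}\quad
  & \operatorname{Var}(Y) = \mu
  && \text{(fixed by the mean)} \\
\text{Poisson with scale:}\quad
  & \operatorname{Var}(Y) = \phi\,\mu
  && (\phi > 1:\ \text{overdispersion};\ \phi < 1:\ \text{underdispersion}) \\[4pt]
\text{Posted Gaussian fit:}\quad
  & \frac{403.67}{56} \approx 7.21 \approx \widehat{\sigma}^2 = 7.2084
\end{align*}
```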
As far as I know, for the Gaussian case, PROC MIXED and PROC GLIMMIX should produce the same results for the same model.
Other problems: poor model fit, which can happen even in non-Gaussian cases with no overdispersion.
But it's not clear why you don't like the model that you have fit. What is wrong with it?
So, to check my understanding: in this case the Chi-square/df does not really say anything about this model. However, if you used the Poisson distribution, as in the example below, would the chi-square/df be meaningful only then? Is there an article about this?
The SAS System
The GLIMMIX Procedure

Model Information
Data Set                      ANU.ORGANDAY1_3
Response Variable             Progizz_score
Response Distribution         Poisson
Link Function                 Log
Variance Function             Default
Variance Matrix Blocked By    Pen
Estimation Technique          Residual PL
Degrees of Freedom Method     Kenward-Roger
Fixed Effects SE Adjustment   Kenward-Roger

Class Level Information
Class    Levels   Values
diet     2        1 2
strain   2        A B

Number of Observations Read   60
Number of Observations Used   60

Dimensions
R-side Cov. Parameters     1
Columns in X               9
Columns in Z per Subject   0
Subjects (Blocks in V)     30
Max Obs per Subject        3

Optimization Information
Optimization Technique   None
Parameters               0
Lower Boundaries         0
Upper Boundaries         0
Fixed Effects            Profiled
Residual Variance        Profiled
Starting From            Data

Iteration History
Iteration   Restarts   Subiterations   Objective Function   Change       Max Gradient
0           0          0               -12.50422765         0.30362060   .
1           0          0               -9.527888119         0.00779297   .
2           0          0               -9.480642908         0.00000395   .
3           0          0               -9.480626779         0.00000000   .

Convergence criterion (PCONV=1.11022E-8) satisfied.

Fit Statistics
-2 Res Log Pseudo-Likelihood   -9.48
Generalized Chi-Square         7.82
Gener. Chi-Square / DF         0.14

Covariance Parameter Estimates
Cov Parm        Estimate   Standard Error
Residual (VC)   0.1397     0.02639

Type III Tests of Fixed Effects
Effect        Num DF   Den DF   F Value   Pr > F
diet          1        56       3.88      0.0539
strain        1        56       1.64      0.2059
diet*strain   1        56       0.81      0.3705

diet Least Squares Means
diet   Estimate   Standard Error   DF   t Value   Pr > |t|
1      1.1795     0.03708          56   31.81     <.0001
2      1.2829     0.03719          56   34.50     <.0001

strain Least Squares Means
strain   Estimate   Standard Error   DF   t Value   Pr > |t|
A        1.1976     0.03886          56   30.82     <.0001
B        1.2648     0.03532          56   35.81     <.0001

diet*strain Least Squares Means
strain   diet   Estimate   Standard Error   DF   t Value   Pr > |t|
A        1      1.1221     0.05699          56   19.69     <.0001
B        1      1.2368     0.04746          56   26.06     <.0001
A        2      1.2730     0.05285          56   24.09     <.0001
B        2      1.2928     0.05233          56   24.70     <.0001
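A quick arithmetic check on the Poisson output above, offered as a sketch rather than a definitive reading: because this fit includes an R-side residual (scale) parameter, the reported chi-square/DF appears to match that estimated scale. The symbol $\widehat{\phi}$ is notation introduced here, and 56 is the denominator DF from the Type III tests.

```latex
\[
\frac{\text{Generalized Chi-Square}}{\text{DF}}
  = \frac{7.82}{56} \approx 0.14 \approx \widehat{\phi} = 0.1397,
\qquad
\operatorname{Var}(Y) \approx \widehat{\phi}\,\mu .
\]
```

A value of $\widehat{\phi}$ well below 1 would point toward underdispersion relative to a pure Poisson model, assuming the mean model itself is adequate.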
Maybe you could answer my questions that you haven't yet answered. Specifically:
"are you saying that this represents over-dispersion? Why do you say that?"
"its not clear why you don't like this model that you have fit, what is wrong with it?"
Maybe it was just a misunderstanding. From the other post, I thought you were saying that there was no overdispersion, but maybe those are different problems. So, I thought you meant that the model did not look good.
I am content with the model, but I was just confused by the high value of the chi-square/df.