Dear Sir or Madam,
How are you?
I am writing to ask your advice on comparing the adequacy of two gamma models with a log link, both fit with PROC GENMOD. Is the second model (B) better? Do I also have to look at the diagnostic plots? Thank you very much.
(1) One 5-level covariate (x)
Criteria For Assessing Goodness Of Fit
Criterion                     DF          Value    Value/DF
Deviance                    9147      8893.7611      0.9723
Scaled Deviance             9147     10383.5112      1.1352
Pearson Chi-Square          9147     25589.8917      2.7976
Scaled Pearson X2           9147     29876.3286      3.2662
Log Likelihood                      -48608.2903
Full Log Likelihood                 -48608.2903
AIC (smaller is better)              97228.5806
AICC (smaller is better)             97228.5898
BIC (smaller is better)              97271.3110
(2) Two 5-level covariates (x and y) and their interaction term
Criteria For Assessing Goodness Of Fit
Criterion                     DF          Value    Value/DF
Deviance                    9037      8296.5195      0.9181
Scaled Deviance             9037     10223.2934      1.1313
Pearson Chi-Square          9037     24705.1601      2.7338
Scaled Pearson X2           9037     30442.6575      3.3687
Log Likelihood                      -47833.7561
Full Log Likelihood                 -47833.7561
AIC (smaller is better)              95719.5123
AICC (smaller is better)             95719.6677
BIC (smaller is better)              95904.4202
The information criteria (AIC, AICC, BIC) all point to the second model (B) as being superior, while none of the other goodness-of-fit statistics really point to any difference between the two. However, I would still examine the observed vs. predicted plot to check for any systematic bias.
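As a sanity check before leaning on the information criteria, the sketch below backs out the number of estimated parameters k and the number of observations n implied by each table. It assumes the usual definitions AIC = -2*logL + 2k and deviance DF = n - (k - 1), with the gamma scale parameter counted in k. Those definitions are assumptions about how the output was produced, not something confirmed in this thread, and the variable names are made up for illustration.

```sas
/* Back out k and n from the goodness-of-fit tables posted above.     */
/* Assumes AIC = -2*logL + 2k and deviance DF = n - (k - 1), where k  */
/* counts the regression parameters plus the gamma scale parameter.  */
data _null_;
   input model $ logL AIC devDF;
   k = round((AIC + 2*logL) / 2);   /* implied number of estimated parameters */
   n = devDF + (k - 1);             /* implied number of observations used    */
   put model= k= n=;
   datalines;
1 -48608.2903 97228.5806 9147
2 -47833.7561 95719.5123 9037
;
```

Under those assumptions the tables imply about 9152 observations for the first model and 9062 for the second. If that is right, the two fits did not use the same rows (for example, because of missing values in the second covariate), and AIC, AICC, and BIC are only directly comparable when both models are fit to the same observations.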
Steve Denham
Hi @SteveDenham.
Thank you for your insight.
Can I please ask why the p-values for the second model (2) are so different between the Deviance and the Scaled Deviance?
As far as I understand, the p-value from the deviance (p=0.999999992) says the model fits the data reasonably well. But why is it 0 for the scaled deviance? I don't understand.
I plotted histograms of the deviance residuals, standardized deviance residuals, and likelihood residuals from the model to check for normality, and they do not indicate any departure from normality.
I also plotted the actual values against the predicted values, and the result is hardly a straight line.
Any insight will be greatly appreciated.
data _null_; p=1-probchi(8296.5195,9037); put p=; run;
p=0.9999999925
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
data _null_; p=1-probchi(10223.2934,9037); put p=; run;
p=0
You can read about estimates of scale in the GENMOD documentation. You didn't show your code, so we don't know how the scale was estimated, but with df=9037, there is bound to be a sizeable difference between the scaled and unscaled deviance.
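To see how large the scale adjustment is here, the ratio of scaled to unscaled deviance can be computed from the posted tables. This sketch assumes that the scaled deviance is the deviance divided by the dispersion parameter, so that the ratio is the reciprocal of the estimated dispersion. The variable names are illustrative only.

```sas
/* Ratio of scaled to unscaled deviance for the two posted models.   */
/* Assumption: scaled deviance = deviance / dispersion.              */
data _null_;
   input model $ deviance scaled_deviance;
   ratio      = scaled_deviance / deviance;   /* 1/dispersion under the assumption */
   dispersion = 1 / ratio;
   put model= ratio= dispersion=;
   datalines;
1 8893.7611 10383.5112
2 8296.5195 10223.2934
;
```

The ratio is roughly 1.17 for the first model and 1.23 for the second. With about 9000 degrees of freedom, an inflation of that size is enough to move the statistic from below the mean of the chi-square distribution (the df) to far out in its upper tail.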
Each p-value is the probability that a random draw from the chisq(df=9037) distribution is at least as great as 8297 (for the deviance) or 10223 (for the scaled deviance). You can look at the reference lines in the following plot to confirm that the computations are correct.
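%let df = 9037;
/* Evaluate the chi-square density on a grid that spans both test statistics */
data PDF;
   do x = 8000 to 10300 by 20;
      chi2 = pdf("chisquare", x, &df);
      output;
   end;
run;
/* Plot the density; reference lines mark the scaled and unscaled deviance */
proc sgplot data=PDF;
   series x=x y=chi2;
   refline 10223 8297 / axis=x;
run;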
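One caveat about the p=0 shown in the log earlier: 1-probchi(...) subtracts a value extremely close to 1 from 1, so a very small upper-tail probability can be lost to rounding. A minimal sketch using the SDF (survival) function, which computes the upper tail directly, is given below. The variable names are illustrative, and the printed values are not reproduced here.

```sas
/* Upper-tail chi-square probabilities computed directly with SDF */
data _null_;
   df = 9037;
   p_dev    = sdf('CHISQUARE', 8296.5195,  df);   /* unscaled deviance */
   p_scaled = sdf('CHISQUARE', 10223.2934, df);   /* scaled deviance   */
   put p_dev= p_scaled= e12.;
run;
```

The scaled-deviance p-value should come out as a very small positive number rather than exactly zero, which does not change the conclusion drawn from the plot.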
Hi @Rick_SAS. Thank you for your reply.
I have read the documentation, but sadly I still don't understand.
Please find my code below if that would be helpful.
Thank you very much.
ods graphics on;
proc genmod data=temp plots=all;
   class country raps;
   model occamt=country raps country*raps / dist=gamma link=log type3;
   lsmeans raps country country*raps / pdiff ilink adj=tukey;
   contrast 'linear' raps -2 -1 0 1 2;
   contrast 'quadratic' raps 2 -1 -2 -1 2;
   contrast 'cubic' raps -1 2 0 -2 1;
   output out=Residuals
      pred=Pred resraw=Resraw reschi=Reschi resdev=Resdev
      stdreschi=Stdreschi stdresdev=Stdresdev reslik=Reslik;
run;
proc univariate data=Residuals normal;
   var Resraw Reschi Stdreschi Resdev Stdresdev Reslik;
   histogram / normal;
run;
proc gplot data=residuals; plot Resdev*Pred; run; quit;
proc gplot data=residuals; plot Stdresdev*Pred; run; quit;
proc gplot data=residuals; plot Reslik*Pred; run; quit;
proc gplot data=residuals; plot occamt*Pred; run; quit;
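For the observed-versus-predicted check, a 45-degree reference line makes systematic bias easier to see. The following is a sketch with PROC SGPLOT, assuming the Residuals data set and the occamt and Pred variables from the code above. The transparency value and line color are arbitrary choices.

```sas
/* Observed response against predicted mean, with a y = x reference line */
proc sgplot data=Residuals;
   scatter x=Pred y=occamt / transparency=0.8;
   lineparm x=0 y=0 slope=1 / lineattrs=(color=red);
   xaxis label="Predicted mean";
   yaxis label="Observed occamt";
run;
```

Because this model contains only CLASS effects and their interaction, Pred takes a single value per country-by-raps cell. The points therefore fall in vertical strips, and a cloud hugging a straight line is not to be expected from this plot; the question is whether the strips are centered on the reference line.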