amanegm
Fluorite | Level 6

Hi All,

 

I am trying to use PROC GENMOD to fit a Poisson regression model for count data. I have written the code below:

 

proc genmod data = COUNT_data;
model count = KM /dist = poisson;
output out = outpt predicted = pred_val resdev = r_dev;
run;

 

Here I write the predicted values and deviance residuals to the variables pred_val and r_dev, respectively, in the output data set outpt. The procedure produces the following output:

 

The SAS System

The GENMOD Procedure

Model Information
Data Set              WORK.COUNT_DATA
Distribution          Poisson
Link Function         Log
Dependent Variable    Count

Number of Observations Read    222
Number of Observations Used    222

Criteria For Assessing Goodness Of Fit
Criterion                     DF        Value    Value/DF
Deviance                     220     170.1860      0.7736
Scaled Deviance              220     170.1860      0.7736
Pearson Chi-Square           220     199.7315      0.9079
Scaled Pearson X2            220     199.7315      0.9079
Log Likelihood                       -91.0389
Full Log Likelihood                 -371.0120
AIC (smaller is better)              746.0240
AICC (smaller is better)             746.0788
BIC (smaller is better)              752.8294

Algorithm converged.

Analysis Of Maximum Likelihood Parameter Estimates
Parameter   DF   Estimate   Standard Error   Wald 95% Confidence Limits   Wald Chi-Square   Pr > ChiSq
Intercept    1     0.5680           0.1108       0.3508       0.7852                26.27       <.0001
KM           1     0.0000           0.0000       0.0000       0.0000                 6.08       0.0137
Scale        0     1.0000           0.0000       1.0000       1.0000

 

I want to save the Deviance (170.1860 as in the above output) in some variable/dataset. How can it be done?

 

Also, how can I calculate DFFITS? I want to find the observations where DFFITS > 2 * sqrt(2/n). I have seen that DFBETAS is available as an OUTPUT statement option. Is DFFITS available in the same way?

Rick_SAS
SAS Super FREQ

> I want to save the Deviance (170.1860 as in the above output) in some variable/dataset. How can it be done?

See the article "ODS OUTPUT: Store any statistic created by any SAS procedure"
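A minimal sketch of that approach for the model in the question, assuming the goodness-of-fit table is named ModelFit and has the columns Criterion and Value (running ODS TRACE ON before the procedure will confirm the names on your release). The data set names fit_stats and deviance and the macro variable model_deviance are placeholders, not from the original post:

```sas
/* Capture the "Criteria For Assessing Goodness Of Fit" table in a data set. */
/* ModelFit is assumed to be the ODS table name; verify with ODS TRACE ON.   */
ods output ModelFit=fit_stats;

proc genmod data=COUNT_data;
  model count = KM / dist=poisson;
  output out=outpt predicted=pred_val resdev=r_dev;
run;

/* Keep only the Deviance row and copy its value to a macro variable. */
data deviance;
  set fit_stats;
  where Criterion = 'Deviance';
  call symputx('model_deviance', Value);
run;

%put Deviance = &model_deviance;
```

The data set deviance then holds one row with the statistic, and &model_deviance can be reused in later steps.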

 

>  How to calculate DFFITS?

The DFFITS option is not available in PROC GENMOD because that statistic assumes an identity link function. However, you can use the COOKSD option (Cook's D) in the OUTPUT statement, which provides similar information about the influence of each observation on the fit.
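As a rough sketch, Cook's D and the leverage can be requested in the OUTPUT statement and then screened in a DATA step. The cutoff 4/n used here is only a common rule of thumb and is an assumption, not something PROC GENMOD prescribes; infl, flagged, cook_d, and lev are placeholder names:

```sas
proc genmod data=COUNT_data;
  model count = KM / dist=poisson;
  /* COOKSD= and LEVERAGE= write case-deletion diagnostics per observation */
  output out=infl predicted=pred_val resdev=r_dev
         cooksd=cook_d leverage=lev;
run;

/* Flag observations whose Cook's D exceeds the assumed cutoff 4/n. */
/* NOBS= counts every row in infl, including any with missing values. */
data flagged;
  set infl nobs=n;
  if cook_d > 4 / n;
run;

proc print data=flagged;
  var count KM pred_val r_dev cook_d lev;
run;
```

Plotting cook_d against the observation number is often more informative than any fixed cutoff, because it shows which observations stand apart from the rest.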

amanegm
Fluorite | Level 6
Thank you Rick. That was very helpful.
I understand that DFFITS is not available in PROC GENMOD. But would it be valid to use DFFITS for this model if I calculate it by some other means? I am using the log link function.

Also, in R, if I build similar model as below:
l = glm(resp[[1]] ~ unlist(regr[[1]]) , family="poisson")

the dffits() function is available to compute that statistic.

So I am confused about whether this is the right approach.
Rick_SAS
SAS Super FREQ

I do not know the answer to your question. 

 

For OLS, DFFITS are closely related to the Studentized residual. When the errors are normally distributed, you can use that relationship to derive the distribution of the DFFITS statistic. Because you know the sampling distribution, you can use criteria such as DFFITS > 2 * sqrt(2/n) or 2 * sqrt(p/n) to find "extreme" values of the statistic.
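For reference, the OLS relationship mentioned above can be written as follows, where t_i is the externally Studentized residual and h_ii is the leverage of observation i (standard notation, not taken from the original post):

```latex
\mathrm{DFFITS}_i = t_i \sqrt{\frac{h_{ii}}{1 - h_{ii}}},
\qquad
\text{flag observation } i \text{ if } \left|\mathrm{DFFITS}_i\right| > 2\sqrt{\frac{p}{n}}
```

Here p is the number of parameters and n is the number of observations, which gives the 2 * sqrt(2/n) cutoff for a model with an intercept and one regressor.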

 

Generalized linear models do not have Studentized residuals; they have other kinds of residuals (such as Pearson, deviance, or chi-square). You can compute the change in the deviance or chi-square that is attributed to deleting each observation, and this becomes a measure of influence.
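One hedged way to approximate that idea without refitting the model n times is to request the likelihood residual (RESLIK=) in the OUTPUT statement. As I understand it, its square is commonly used as a one-step approximation to the change in deviance from deleting an observation, so treat the delta_dev column below as an approximation under that assumption. The names resids, r_lik, r_std_chi, r_std_dev, and delta_dev are placeholders:

```sas
proc genmod data=COUNT_data;
  model count = KM / dist=poisson;
  /* Likelihood residual plus standardized Pearson and deviance residuals */
  output out=resids reslik=r_lik stdreschi=r_std_chi
         stdresdev=r_std_dev leverage=lev;
run;

/* Squared likelihood residual: assumed one-step approximation */
/* to the change in deviance when the observation is deleted.  */
data resids;
  set resids;
  delta_dev = r_lik**2;
run;

/* List the observations with the largest approximate change first. */
proc sort data=resids;
  by descending delta_dev;
run;
```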

 

I was unable to find a textbook or journal article that explains how to generalize DFFITS to generalized linear regression models. Consequently, I recommend using one of the case-deletion statistics that PROC GENMOD provides, such as Cook's D. Perhaps an expert such as  @SteveDenham or @lvm can provide additional insight.

 

 

SteveDenham
Jade | Level 19

I'll back up @Rick_SAS on this one.  DFFITS is not appropriate for generalized linear models, as the studentized residual depends on the assumption of normality of errors.  Cook's D has been used fairly regularly to check for influential observations.

 

Steve Denham
