factorhedge
Fluorite | Level 6

I am using financial panel data and would like to run a regression using PROC GENMOD, clustering by firm. I am surprised that the output does not provide adjusted R-squared when the y variable is not binary. Could somebody explain how to get adjusted R-squared when I run a regression using PROC GENMOD?

5 REPLIES
SteveDenham
Jade | Level 19

PROC GENMOD uses maximum likelihood methods rather than ordinary least squares to obtain estimates.  Consequently, adjusted R squared as it is usually thought of is not (and really cannot be) reported.  Further, I wonder how you "cluster" by firm in a procedure that does not accommodate hierarchical clustering--that would be the domain of PROC GLIMMIX.  But I could be misunderstanding the terminology of the particular field.

Anyway, Google is always your friend. It tells me that scaled deviance and Pearson's chi-square are the usual goodness-of-fit measures for generalized linear models, including those with a normal distribution.
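For context, the adjusted R-squared that ordinary least squares output reports is usually defined from sums of squares, which is why it does not carry over directly to a likelihood-based fit. A sketch of that definition, where n (number of observations) and p (number of predictors, excluding the intercept) are notation assumed here rather than taken from the thread:

```latex
R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2},
\qquad
R^2_{\mathrm{adj}} = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}
```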

Steve Denham

pink_poodle
Barite | Level 11

@SteveDenham ,

Is my generalized linear model, with a normal distribution and an identity link function, a good fit?

Criteria For Assessing Goodness Of Fit

Criterion                    DF         Value       Value/DF
Scaled Deviance            2153   10351789298   4808076.7757
Pearson Chi-Square         2153   10351789298   4808076.7757
Scaled Pearson X2          2153   10351789298   4808076.7757
Log Likelihood                    -5175896643
Full Log Likelihood               -5175896643
AIC (smaller is better)           10351793320
AICC (smaller is better)          10351793321
BIC (smaller is better)           10351793417

Many thanks!

SteveDenham
Jade | Level 19

I am going to vote no on being a good fit.  You would really like the Chi squared/DF ratio to be close to one, and here it is over 4 million. So either your model or your distribution is inappropriate.
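For reference, the ratio in question is the Value/DF column of the posted output. Using the Pearson chi-square row, the arithmetic works out as follows:

```latex
\frac{X^2_{\mathrm{Pearson}}}{\mathrm{DF}} = \frac{10351789298}{2153} \approx 4808076.78
```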

 

SteveDenham

StatDave
SAS Super FREQ

While I don't know if its performance has been studied in detail, Zheng (2000) presents an R-square measure for the GEE (marginal) model that is easy to compute. The following uses the respiratory data in Stokes et al. (2012), which is modeled using a binary logistic GEE model.

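/* Fit a binary logistic GEE model; each subject is an id within a center */
proc genmod data=resp2;
   class id center;
   model dichot(event="1") = di_base / link=logit dist=bin;
   repeated subject=id*center / type=un;
   /* Save raw residuals (res) and predicted values (pred) */
   output out=out resraw=res pred=pred;
run;

/* Marginal R-square: 1 - (sum of squared raw residuals) / (corrected SS of the response) */
proc sql;
   select 1-(uss(res)/css(dichot)) as R2marg from out;
quit;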
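
Written as a formula, the PROC SQL step above appears to compute the quantity below: USS(res) is the uncorrected sum of squares of the raw residuals, and CSS is the corrected sum of squares of the response. The subscripts (i for subject, t for repeated measurement) and the symbols are notation assumed here for illustration, not taken from the post:

```latex
R^2_{\mathrm{marg}}
  = 1 - \frac{\sum_{i}\sum_{t} \left(y_{it} - \hat{y}_{it}\right)^2}
             {\sum_{i}\sum_{t} \left(y_{it} - \bar{y}\right)^2}
```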

Zheng, B. (2000). Summarizing the goodness of fit of generalized linear models for longitudinal data. Statistics in Medicine, 19(10), 1265-1275.

Stokes, M. et al. (2012). Categorical Data Analysis Using SAS, Third Edition. SAS Institute.
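
The same computation could be adapted to the original question (a continuous response with clustering by firm). The following is only a minimal sketch under stated assumptions: the data set name panel, the cluster variable firm, the response y, and the regressors x1 and x2 are all hypothetical, and the independence working correlation is just one possible choice. It also assumes there are no missing values, so that the residuals and the response cover the same rows.

```sas
/* Hypothetical names: panel (data set), firm (cluster id),
   y (continuous response), x1 x2 (regressors) */
proc genmod data=panel;
   class firm;
   model y = x1 x2 / dist=normal link=identity;
   /* GEE with firm as the cluster; type=ind is an assumed working correlation */
   repeated subject=firm / type=ind;
   /* Save raw residuals and predicted values for the R-square step */
   output out=out resraw=res pred=pred;
run;

/* Marginal R-square, computed the same way as in the example above */
proc sql;
   select 1 - (uss(res)/css(y)) as R2marg
   from out;
quit;
```

Note that this yields the unadjusted marginal R-square; no degrees-of-freedom adjustment is applied in this sketch.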

SteveDenham
Jade | Level 19

Learned something big there - that the summarizing options from PROC MEANS are available in PROC SQL.

 

SteveDenham
