factorhedge
Fluorite | Level 6

I am using financial panel data and would like to run a regression using PROC GENMOD, clustering by firm. I am surprised that the output does not provide an adjusted R-squared when the y variable is not binary. Could somebody explain how to get an adjusted R-squared when I run a regression with PROC GENMOD?

SteveDenham
Jade | Level 19

PROC GENMOD uses maximum likelihood methods rather than ordinary least squares to obtain estimates.  Consequently, adjusted R squared as it is usually thought of is not (and really cannot be) reported.  Further, I wonder how you "cluster" by firm in a procedure that does not accommodate hierarchical clustering--that would be the domain of PROC GLIMMIX.  But I could be misunderstanding the terminology of the particular field.

Anyway, Google is always your friend. It tells me that scaled deviance and Pearson's chi-square are the usual goodness-of-fit measures for generalized linear models, including those with a normal distribution.

Steve Denham

pink_poodle
Barite | Level 11

@SteveDenham,

Is my generalized linear model with a normal distribution and an identity link function a good fit?

Criteria For Assessing Goodness Of Fit
Criterion                     DF          Value       Value/DF
Scaled Deviance             2153    10351789298   4808076.7757
Pearson Chi-Square          2153    10351789298   4808076.7757
Scaled Pearson X2           2153    10351789298   4808076.7757
Log Likelihood                      -5175896643
Full Log Likelihood                 -5175896643
AIC (smaller is better)             10351793320
AICC (smaller is better)            10351793321
BIC (smaller is better)             10351793417

Many thanks!

SteveDenham
Jade | Level 19

I am going to vote no on being a good fit.  You would really like the Chi squared/DF ratio to be close to one, and here it is over 4 million. So either your model or your distribution is inappropriate.
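To make the ratio concrete: the Value/DF column in the posted output is just the criterion value divided by its degrees of freedom, so the Pearson chi-square row works out as follows.

```latex
\[
\frac{\chi^2_{\text{Pearson}}}{\mathrm{DF}} = \frac{10351789298}{2153} \approx 4808076.78
\]
```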

SteveDenham

StatDave
SAS Super FREQ

While I don't know if its performance has been studied in detail, Zheng (2000) presents an R-square measure for the GEE (marginal) model that is easy to compute. The following example uses the respiratory data in Stokes et al. (2012), which are modeled with a binary logistic GEE model.

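/* Binary logistic GEE model; observations are clustered within id*center */
proc genmod data=resp2;
  class id center;
  model dichot(event="1") = di_base / link=logit dist=bin;
  repeated subject=id*center / type=un;
  /* Save raw residuals (observed - predicted) and predicted means */
  output out=out resraw=res pred=pred;
run;

/* Marginal R-square: 1 - (sum of squared residuals) / (corrected SS of the response) */
proc sql;
  select 1-(uss(res)/css(dichot)) as R2marg from out;
quit;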
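Written out, the PROC SQL expression above computes the quantity below. USS(res) is the uncorrected sum of squares of the raw residuals and CSS(dichot) is the corrected sum of squares of the response. The symbols are notation introduced here for illustration and do not come from the post: y_{ij} is the j-th observation on subject i, \hat{\mu}_{ij} is its predicted mean, and \bar{y} is the overall mean of the response.

```latex
\[
R^2_{\text{marg}} = 1 - \frac{\sum_{i}\sum_{j} \left(y_{ij} - \hat{\mu}_{ij}\right)^2}{\sum_{i}\sum_{j} \left(y_{ij} - \bar{y}\right)^2}
\]

% Assumption, not from the post or the cited paper: the usual OLS-style
% adjustment, with n total observations and p predictors (excluding the intercept).
\[
R^2_{\text{adj}} = 1 - \left(1 - R^2_{\text{marg}}\right)\frac{n - 1}{n - p - 1}
\]
```

Nothing in the first formula depends on the response being binary, so the same residual-based calculation could in principle be applied to a continuous response. The second formula is only a sketch of how an adjusted version might be formed. Whether that adjustment behaves well for GEE models is not something this thread establishes.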

Zheng, B. (2000). Summarizing the goodness of fit of generalized linear models for longitudinal data. Statistics in Medicine, 19(10), 1265-1275.

Stokes, M. et al. (2012). Categorical Data Analysis Using SAS, Third Edition. SAS Institute.

SteveDenham
Jade | Level 19

Learned something big there: the summary statistics from PROC MEANS (such as USS and CSS) are available as functions in PROC SQL.

SteveDenham
