I am using financial panel data and would like to run a regression with PROC GENMOD, clustering by firm. I am surprised that the output does not provide an adjusted R-squared when the y variable is not binary. Could somebody explain how to get an adjusted R-squared when I run a regression with PROC GENMOD?
PROC GENMOD uses maximum likelihood methods rather than ordinary least squares to obtain estimates. Consequently, adjusted R squared as it is usually thought of is not (and really cannot be) reported. Further, I wonder how you "cluster" by firm in a procedure that does not accommodate hierarchical clustering--that would be the domain of PROC GLIMMIX. But I could be misunderstanding the terminology of the particular field.
Anyway, Google is always your friend. It tells me that scaled deviance and Pearson's chi square are the usual methods for generalized linear models, including those with a normal distribution.
Steve Denham
Is my generalized linear model with a normal probability distribution and an identity link function a good fit?
Criteria For Assessing Goodness Of Fit

Criterion | DF | Value | Value/DF |
---|---|---|---|
Scaled Deviance | 2153 | 10351789298 | 4808076.7757 |
Pearson Chi-Square | 2153 | 10351789298 | 4808076.7757 |
Scaled Pearson X2 | 2153 | 10351789298 | 4808076.7757 |
Log Likelihood | | -5175896643 | |
Full Log Likelihood | | -5175896643 | |
AIC (smaller is better) | | 10351793320 | |
AICC (smaller is better) | | 10351793321 | |
BIC (smaller is better) | | 10351793417 | |
Many thanks!
I am going to vote no on this being a good fit. You would really like the chi-square/DF ratio to be close to one, and here it is over 4 million. So either your model or your distribution is inappropriate.
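For reference, the Value/DF column in the table above is simply each statistic divided by its degrees of freedom. Using the Pearson chi-square row as posted:

```latex
\[
\frac{\chi^2_{\text{Pearson}}}{\mathrm{DF}}
  = \frac{10351789298}{2153}
  \approx 4808076.78
\]
```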
SteveDenham
While I don't know if its performance has been studied in detail, Zheng (2000) presents an R-square measure for the GEE (marginal) model which is easy to compute. The following uses the respiratory data in Stokes et al. (2012), which is modeled using a binary logistic GEE model.
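/* Fit a binary logistic GEE model; subjects are id within center */
proc genmod data=resp2;
   class id center;
   model dichot(event="1") = di_base / link=logit dist=bin;
   repeated subject=id*center / type=un;
   /* Save raw residuals (observed minus predicted) and predicted probabilities */
   output out=out resraw=res pred=pred;
run;

/* Marginal R-square: 1 - (sum of squared residuals) / (corrected sum of squares of the response) */
proc sql;
   select 1-(uss(res)/css(dichot)) as R2marg from out;
quit;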
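Written out, the quantity computed by the PROC SQL step is the following, where USS(res) is the uncorrected sum of squares of the raw residuals and CSS(dichot) is the sum of squares of the response around its overall mean. The subscript notation (subject i, observation j) is my own shorthand and is not taken from the post:

```latex
\[
R^2_{\text{marg}}
  = 1 - \frac{\sum_{i}\sum_{j} \left( y_{ij} - \hat{y}_{ij} \right)^2}
             {\sum_{i}\sum_{j} \left( y_{ij} - \bar{y} \right)^2}
\]
```

Here \(\hat{y}_{ij}\) is the predicted value from the fitted marginal model and \(\bar{y}\) is the mean of the response over all observations.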
Zheng, B. (2000). Summarizing the goodness of fit of generalized linear models for longitudinal data. Statistics in Medicine, 19(10), 1265-1275.
Stokes, M. et al. (2012). Categorical Data Analysis Using SAS, Third Edition, SAS Institute.
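For the situation in the original question (a continuous response with observations clustered by firm), the same calculation might look like the sketch below. This is only an illustration under assumptions: the data set name PANEL, the variables Y, X1, X2 and FIRM, and the exchangeable working correlation are all hypothetical and are not taken from the thread.

```sas
/* Hypothetical names: data set PANEL, response Y, predictors X1 X2, cluster id FIRM */
proc genmod data=panel;
   class firm;
   model y = x1 x2 / dist=normal link=identity;
   /* GEE with firm as the cluster; TYPE=EXCH is an assumed working correlation */
   repeated subject=firm / type=exch;
   output out=gee_out resraw=res pred=pred;
run;

/* Marginal R-square computed only over observations that have a residual */
proc sql;
   select 1 - (uss(res) / css(y)) as R2marg
   from gee_out
   where res is not missing;
quit;
```

The WHERE clause keeps the numerator and denominator on the same set of observations if some rows were dropped from the fit because of missing values. Note that this is an unadjusted measure; the thread does not describe an adjusted version.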
Learned something big there - that the summarizing options from PROC MEANS are available in PROC SQL.
SteveDenham