I am using financial panel data and would like to run a regression with PROC GENMOD, clustering by firm. I am surprised that the output does not provide an adjusted R-square when the y variable is not binary. Could somebody explain how to get the adjusted R-square when I run a regression with PROC GENMOD?
PROC GENMOD uses maximum likelihood methods rather than ordinary least squares to obtain estimates. Consequently, adjusted R squared as it is usually thought of is not (and really cannot be) reported. Further, I wonder how you "cluster" by firm in a procedure that does not accommodate hierarchical clustering--that would be the domain of PROC GLIMMIX. But I could be misunderstanding the terminology of the particular field.
Anyway, Google is always your friend. It tells me that scaled deviance and Pearson's chi-square are the usual goodness-of-fit measures for generalized linear models, including those with a normal distribution.
Steve Denham
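For the special case of a normal distribution with an identity link, an R-square analogue can still be computed by hand from the raw residuals. The following is only a sketch under assumptions: the data set name `panel` and the variables `y`, `x1`, `x2`, and `firm` are hypothetical, and the adjustment assumes two predictors. With an independence working correlation the point estimates should match ordinary least squares, so the values should agree with what PROC REG reports, while the REPEATED statement supplies firm-clustered (empirical) standard errors.

```sas
/* Hypothetical names: data set PANEL, response Y, predictors X1 X2, cluster id FIRM */
proc genmod data=panel;
   class firm;
   model y = x1 x2 / dist=normal link=identity;
   /* GEE with independence working correlation: empirical standard errors clustered by firm */
   repeated subject=firm / type=ind;
   /* keep the raw residuals (observed minus predicted) */
   output out=fit resraw=res;
run;

proc sql;
   /* R2    = 1 - SSE/SST
      AdjR2 = 1 - (SSE/SST) * (n - 1) / (n - p - 1), with p = 2 predictors assumed here */
   select 1 - uss(res)/css(y) as R2,
          1 - (uss(res)/css(y)) * (count(res) - 1) / (count(res) - 2 - 1) as AdjR2
   from fit
   where res is not missing;
quit;
```

Change the `2` in the second expression to the number of predictors in your own model (not counting the intercept).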
Is my generalized linear model with a normal probability distribution and an identity link function a good fit?
Criteria For Assessing Goodness Of Fit

Criterion | DF | Value | Value/DF |
---|---|---|---|
Scaled Deviance | 2153 | 10351789298 | 4808076.7757 |
Pearson Chi-Square | 2153 | 10351789298 | 4808076.7757 |
Scaled Pearson X2 | 2153 | 10351789298 | 4808076.7757 |
Log Likelihood | | -5175896643 | |
Full Log Likelihood | | -5175896643 | |
AIC (smaller is better) | | 10351793320 | |
AICC (smaller is better) | | 10351793321 | |
BIC (smaller is better) | | 10351793417 | |
Many thanks!
I am going to vote no on being a good fit. You would really like the Chi squared/DF ratio to be close to one, and here it is over 4 million. So either your model or your distribution is inappropriate.
SteveDenham
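For reference, the Value/DF column is simply each criterion divided by its degrees of freedom. With the figures posted in the question:

```latex
\frac{\text{Pearson Chi-Square}}{\text{DF}} = \frac{10351789298}{2153} \approx 4808076.78
```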
While I don't know if its performance has been studied in detail, Zheng (2000) presents an R-square measure for the GEE (marginal) model that is easy to compute. The following uses the respiratory data in Stokes et al. (2012), modeled with a binary logistic GEE model.
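proc genmod data=resp2;
   class id center;
   model dichot(event="1") = di_base / link=logit dist=bin;
   repeated subject=id*center / type=un;
   /* save raw residuals (observed minus predicted) and predicted probabilities */
   output out=out resraw=res pred=pred;
run;
proc sql;
   /* 1 - (sum of squared raw residuals) / (corrected sum of squares of the response) */
   select 1-(uss(res)/css(dichot)) as R2marg from out;
quit;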
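Written out, the PROC SQL query above appears to compute the quantity below. The index notation is my own rather than taken from Zheng (2000): y_it is the observed response for subject i at visit t, ŷ_it is the predicted mean from the GEE fit, and ȳ is the overall mean of the response. The numerator corresponds to `uss(res)` and the denominator to `css(dichot)`.

```latex
R^2_{\text{marg}} = 1 - \frac{\sum_{i}\sum_{t} \left( y_{it} - \hat{y}_{it} \right)^2}{\sum_{i}\sum_{t} \left( y_{it} - \bar{y} \right)^2}
```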
Zheng, B. (2000). Summarizing the goodness of fit of generalized linear models for longitudinal data. Statistics in Medicine, 19(10), 1265-1275.
Stokes, M. et al. (2012). Categorical Data Analysis Using SAS, Third Edition. SAS Institute.
Learned something big there - that the summarizing options from PROC MEANS are available in PROC SQL.
SteveDenham