Re: Help on PROC GENMOD

factorhedge · Posted 06-26-2013 02:35 AM

I am using financial panel data and would like to run a regression using PROC GENMOD, clustering by firm. I am surprised that the output does not provide adjusted R-squared when the y variable is not binary. Could somebody explain how to get adjusted R-squared when I run a regression using PROC GENMOD?

SteveDenham · Posted 06-26-2013 01:24 PM

PROC GENMOD uses maximum likelihood methods rather than ordinary least squares to obtain estimates. Consequently, adjusted R squared as it is usually thought of is not (and really cannot be) reported. Further, I wonder how you "cluster" by firm in a procedure that does not accommodate hierarchical clustering--that would be the domain of PROC GLIMMIX. But I could be misunderstanding the terminology of the particular field.

Anyway, Google is always your friend. It tells me that scaled deviance and Pearson's chi square are the usual methods for generalized linear models, including those with a normal distribution.

Steve Denham

pink_poodle · Posted 03-25-2021 11:14 AM

@SteveDenham ,

Is my generalized linear model, with a normal probability distribution and an identity link function, a good fit?

Criteria For Assessing Goodness Of Fit
Criterion	DF	Value	Value/DF
Scaled Deviance	2153	10351789298	4808076.7757
Pearson Chi-Square	2153	10351789298	4808076.7757
Scaled Pearson X2	2153	10351789298	4808076.7757
Log Likelihood		-5175896643
Full Log Likelihood		-5175896643
AIC (smaller is better)		10351793320
AICC (smaller is better)		10351793321
BIC (smaller is better)		10351793417

Many thanks!

SteveDenham · Posted 03-25-2021 11:24 AM

I am going to vote no on being a good fit. You would really like the Chi squared/DF ratio to be close to one, and here it is over 4 million. So either your model or your distribution is inappropriate.

SteveDenham

StatDave · Posted 03-31-2021 12:25 PM

While I don't know if its performance has been studied in detail, Zheng (2000) presents an R-square measure for the GEE (marginal) model that is easy to compute. The following uses the respiratory data in Stokes et al. (2012), which is modeled using a binary logistic GEE model.

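/* Fit a binary logistic GEE model; observations are clustered within id*center */
proc genmod data=resp2;
   class id center;
   model dichot(event="1") = di_base / link=logit dist=bin;
   repeated subject=id*center / type=un;
   /* Save raw residuals (observed - predicted) and predicted probabilities */
   output out=out resraw=res pred=pred;
run;

/* Marginal R-square: 1 - (sum of squared residuals) / (corrected sum of squares of the response) */
proc sql;
   select 1-(uss(res)/css(dichot)) as R2marg from out;
quit;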
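
Written out as a formula, the PROC SQL step above appears to compute the following. This is a reading of the code, not a quotation from Zheng (2000). The symbols are notation assumed here: y_ij is the observed response for observation j in cluster i, \hat{y}_ij is its predicted mean from the model, and \bar{y} is the overall mean of the response.

```latex
% Assumed notation: y_{ij} observed response, \hat{y}_{ij} predicted mean,
% \bar{y} overall mean of the response across all clusters and observations.
R^2_{\text{marg}}
  = 1 - \frac{\sum_{i}\sum_{j} \left( y_{ij} - \hat{y}_{ij} \right)^2}
             {\sum_{i}\sum_{j} \left( y_{ij} - \bar{y} \right)^2}
```

Here USS(res) is the uncorrected sum of squares of the raw residuals (the numerator) and CSS(dichot) is the sum of squares of the response about its overall mean (the denominator), so the ratio has the same form as 1 - SSE/SST in ordinary least squares.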

Zheng, B. (2000). Summarizing the goodness of fit of generalized linear models for longitudinal data. Statistics in Medicine, 19(10), 1265-1275.

Stokes, M. et al. (2012). Categorical Data Analysis Using SAS, Third Edition, SAS Institute.

SteveDenham · Posted 04-01-2021 08:42 AM

Learned something big there - that the summarizing options from PROC MEANS are available in PROC SQL.

SteveDenham