I saw the previous thread about the pseudo-R2 formula for PROC NLIN: https://communities.sas.com/t5/Statistical-Procedures/nlin-goodness-of-fit/td-p/71273
Does anyone know of a reference for the pseudo-R2 formula "pseudoR2 = 1 - (SSerror/SStotal(corrected))"? I couldn't find where this formula comes from. There are several different pseudo-R2s, such as McFadden's R2, Efron's R2, and McKelvey & Zavoina's R2.
@Rick_SAS, @SteveDenham, @lvm, do you know a good reference for the formula? My regression is not a logit or probit regression. This is my linear model:
proc nlin data=one method=marquardt;
P=0.85;
parms b0=95 b1=.19 xp=20.0;
model cumgerm = b0 / (1 + ((1-P)/P) * exp(-b1*(gdd-xP)));
* output out=A1 p=pred r=resid;
run;
Thanks very much!
Update: Sorry, no need for the reference. The formula is just the regular R2 for a linear model.
However, are there any other pseudo-R2s that can be obtained from PROC NLIN results? I know some are based on the log likelihood or are meant for a nominal response, so I'm not sure which one can be used for a continuous response and continuous explanatory variables, as in my model. Thanks!!
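For reference, here is a minimal sketch of how the formula "pseudoR2 = 1 - (SSerror/SStotal(corrected))" could be computed from PROC NLIN output. It assumes the OUTPUT statement from the code above is enabled, so the data set A1 and the variables pred and resid exist. The column name pseudo_r2 is only an illustrative choice.

```sas
/* Refit the model from the post with the OUTPUT statement enabled,
   so that predicted values and residuals are saved to data set A1. */
proc nlin data=one method=marquardt;
   P = 0.85;
   parms b0=95 b1=.19 xp=20.0;
   model cumgerm = b0 / (1 + ((1-P)/P) * exp(-b1*(gdd-xp)));
   output out=A1 p=pred r=resid;
run;

/* USS(resid) is the error sum of squares; CSS(cumgerm) is the corrected
   total sum of squares. Rows with a missing residual are excluded so
   that both sums use the same observations. */
proc sql;
   select 1 - uss(resid) / css(cumgerm) as pseudo_r2
   from A1
   where resid is not missing;
quit;
```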
Well, first off, the MODEL statement you have here isn't linear in the parameters (although it could be linearized with a logit-type transform, log(cumgerm/(b0 - cumgerm)), if the asymptote b0 were treated as known), so the standard R-squared is probably not valid. CrossValidated has a number of answers about pseudo R-squared measures, but the one that makes the most sense to me is McFadden's likelihood-based pseudo R-squared. To get that, try fitting your model using PROC NLMIXED. The null model that you would compare to would be
model cumgerm = ;
I chose this as the model because setting b0 to zero makes the right-hand side identically zero for all values of gdd, and I hope this achieves the same thing. If not, then move the parameter values into direct code assignments, like P=0.85, with all of them set to 0.
Once you have both log likelihoods, you can use 1 - LL(fit model)/LL(null model) to calculate the pseudo R-squared. Having both the response and explanatory variables as continuous actually makes this a bit easier.
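A minimal sketch of those steps follows, assuming normally distributed errors with constant variance. The variance parameter s2, its starting values, and the data set names fit_full and fit_null are illustrative assumptions and are not given in this thread. The null model fixes the mean at zero, as described above. The row label '-2 Log Likelihood' is what I expect in the FitStatistics table, so check it against your own output.

```sas
/* Full model: same mean function as the PROC NLIN fit.
   s2 (residual variance) and its starting value are assumptions. */
ods output FitStatistics=fit_full;
proc nlmixed data=one;
   parms b0=95 b1=.19 xp=20.0 s2=25;
   bounds s2 > 0;
   P = 0.85;
   mu = b0 / (1 + ((1-P)/P) * exp(-b1*(gdd-xp)));
   model cumgerm ~ normal(mu, s2);
run;

/* Null model: mean fixed at zero, only the variance is estimated. */
ods output FitStatistics=fit_null;
proc nlmixed data=one;
   parms s2=2500;
   bounds s2 > 0;
   model cumgerm ~ normal(0, s2);
run;

/* The ratio of the two -2 log likelihoods equals LL(fit model)/LL(null model).
   Assumes the FitStatistics row is labeled '-2 Log Likelihood'. */
proc sql;
   select 1 - f.Value / n.Value as mcfadden_r2
   from fit_full as f, fit_null as n
   where f.Descr = '-2 Log Likelihood'
     and n.Descr = '-2 Log Likelihood';
quit;
```

With a continuous response, the normal density can exceed 1, so the log likelihoods are not guaranteed to be negative. It is worth checking the signs of both values before interpreting the ratio.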
SteveDenham