nlin&goodness of fit

Occasional Contributor
Posts: 5

nlin&goodness of fit

I am fitting a logistic model to sample data using PROC NLIN.

proc nlin data=aa;
parms a=5 to 6 by 0.01 b=0 to 0.02 by 0.001;
model y=k/(1+a*exp(-b*x));
run;

Does anyone know how to get a goodness-of-fit measure?
Valued Guide
Posts: 684

Re: nlin&goodness of fit

There are different views on how to assess goodness of fit for nonlinear models. Many investigators want an R^2 value, but some statisticians feel that R^2 values should not be calculated for nonlinear models. The book by Ratkowsky summarizes this perspective. However, I side with others who think an "R^2-type" statistic has value. NLIN does not report it, but it can be calculated manually. The statistic is sometimes called a pseudoR^2, and is defined using one of the standard definitions for linear models:
pseudoR2 = 1 - (SSerror/SStotal(corrected))
You get SSerror directly from the table in the NLIN output. However, the SStotal(corrected) is not given in this table. There are reasons for this, but I won't go into these here. The NLIN output gives the uncorrected total SS (sum of squares around 0). You can get the total corrected sum of squares (sum of squares around the mean) for y using css with proc means:

proc means data=a css mean var ;
var y;
run;

Then do the calculation by hand. With a very bad fit of a model, the pseudo-R2 could actually be negative.
Note: in linear models with an intercept, the mean y corresponds to a reduced model (with an intercept and no other parameters). Thus, the regular R2 is a nice comparison of the relative change in sums of squares between a full and reduced model. But with most nonlinear models, the mean y does not correspond to a reduced model compared with the full model (the nonlinear one being fitted). This is partly why some do not like the idea of R2 for nonlinear models. To me, however, it is still interesting to see the fit of the nonlinear model relative to a model with only the mean y (even though this is not a special case of the other). I realize that others may write in disagreement with my view. I do agree that one must be cautious in interpretation.

There are other statistics, such as MSE, that can/should be used.

You probably want to look at residuals. Just add an output statement in NLIN like:
output out=preds predicted=p residual=r student=s;
There are other keywords that could be added. You can then plot the studentized residuals versus predicted values using GPLOT or SGSCATTER.
Another caution: it can be shown that there can still be a trend in the residual plot even when the appropriate nonlinear model is being fitted. The textbook by Schabenberger and Pierce discusses this at length. There are other types of residuals to consider, but these are tedious to calculate.

Finally: the NLIN procedure will have a nice upgrade in SAS 9.3. In particular, there will be some nice ODS graphics that will enable you to do a wide range of model assessments.
Occasional Contributor
Posts: 5

Re: nlin&goodness of fit

Dear lvm,
Thank you for your reply. I know little about SAS. Could you explain this a little more?
"You probably want to look at residuals. Just add an output statement in NLIN like:
output out=preds predicted=p residual=r student=s;"
I ran these statements, but I didn't get the residuals. Why?
How can I use GPLOT or SGSCATTER to plot the studentized residuals versus predicted values?
Valued Guide
Posts: 684

Re: nlin&goodness of fit

Here is an example (with a different model). The OUTPUT statement stores the residuals and other statistics in a new data set (it does not print them), as you can see when the created data set is printed. I also show how to use GPLOT.

data a;
input x y;
datalines;
0 0
1 2
2 5
3 10
4 10
5 12
6 12
7 15
8 14
9 15.5
;
proc nlin data=a;
parameters a 20 b .2;
model y = a*(1 - exp(-b*x));
output out=a_pred predicted=p student=s residual=r; *-file a_pred contains residuals;
run;
proc print data=a_pred;run;
proc means data=a uss css mean var ; *-css is corrected sum of squares;
var y;
run;
proc gplot data=a_pred; *-plots of observed and predicted y versus x, and residuals;
symbol1 color=blue h=2 v=dot i=none;
symbol2 color=red w=2 line=1 v=none i=join;
symbol3 color=black w=2 v=dot i=none;
plot (y p)*x / overlay;
plot s*p=3; *-plot studentized residuals versus predicted y;
plot r*p=3; *-plot regular residuals versus predicted y;
run;
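
Since SGSCATTER was mentioned earlier, here is a rough equivalent of the residual plot using the SG procedures. This is only a sketch, assuming the a_pred data set created by the OUTPUT statement above (with p = predicted values and s = studentized residuals); the reference line at zero is my own addition to make trends easier to see.

```sas
/* Studentized residuals versus predicted values. */
/* Assumes data set a_pred from the OUTPUT statement above. */
proc sgscatter data=a_pred;
   plot s*p;
run;

/* Same plot with PROC SGPLOT, adding a horizontal reference line at 0. */
proc sgplot data=a_pred;
   scatter x=p y=s;
   refline 0 / axis=y;
run;
```

Either procedure should work for a quick look; SGPLOT gives a little more control over a single plot.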

Based on your questions, you probably need to learn more about SAS in general. There are many online resources, and many books where SAS is used throughout.

Note, from the above output, the pseudoR2 is 1 - (8.4491/266.225) = 0.968.
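
If you would rather not copy numbers from the output, the same calculation can be sketched in PROC SQL. This assumes the a_pred data set from the example above (y = observed values, r = residuals); the column names SSerror, SStotal_corrected and pseudoR2 are just illustrative labels. USS(r) is the sum of squared residuals and CSS(y) is the corrected total sum of squares.

```sas
/* Pseudo-R2 = 1 - SSerror / SStotal(corrected).      */
/* Assumes a_pred holds y and the residual r from NLIN. */
proc sql;
   select uss(r)              as SSerror,
          css(y)              as SStotal_corrected,
          1 - uss(r) / css(y) as pseudoR2 format=6.3
   from a_pred;
quit;
```

For the example data this should agree with the hand calculation above, since it uses the same two sums of squares.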
Occasional Contributor
Posts: 5

Re: nlin&goodness of fit

Thank you. You are very kind to teach me.