09-22-2015 04:45 PM
I am wondering how you evaluate your regression models. The R-squared value may be the most popular measure. However, it is not appropriate for a nonlinear regression model, especially when one wants to compare a linear regression model with a nonlinear one.
My question is: is there any systematic performance measure for linear or nonlinear regression models? Any comments are highly appreciated. Thanks in advance!
09-23-2015 08:11 AM
"However, it is not appropriate when it comes to nonlinear regression model, especially when one wants to compare linear regression VS nonlinear regression model."
I'm not sure why you say this. I would certainly compare the R-squared on a nonlinear model to the R-squared on a linear model on the same data.
09-23-2015 08:17 AM
There is a difference between a nonlinear fit using linear regression (i.e., including higher-order terms) and nonlinear regression. This article explains it well:
This article does a good job of explaining why R-squared is not a valid error assessment for nonlinear regression.
09-23-2015 12:20 PM
In the context of data mining or predictive modeling, you care about how accurate your predictions are.
I would look at misclassification and ROC index if I am predicting a binary or nominal target, and at average square error for any other target or response.
I hope it helps!
09-25-2015 03:02 PM - edited 09-25-2015 03:04 PM
I have a question for you. Suppose I am going to implement the equations mentioned below (FB, MG, etc.) to evaluate my predictive model in a SAS Code node. My model is a linear model using stepwise selection, as follows:
data mydata;
  set &EM_IMPORT_DATA(in=a) &EM_IMPORT_VALIDATE(in=b) &EM_IMPORT_TEST(in=c);
  if a then _partition="_Train";
  else if b then _partition="_Valid";
  else if c then _partition="_Test";
run;

proc glmselect DATA=mydata namelen=100;
  effect MyPoly = polynomial(A B... F/degree=5);
  model Y = MyPoly / selection=stepwise(select=SL SLE=0.05 SLS=0.05);
  partition rolevar=_partition(TEST='_Test' TRAIN='_Train' VALIDATE='_Valid');
run;
How can I perform these measurements in SAS? Is there any way I can calculate the residuals without knowing the prediction model? Thanks!
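One possible approach, sketched under assumptions: PROC GLMSELECT has an OUTPUT statement that writes predicted values and residuals to a data set, so the selected model does not need to be written out by hand. The data set name scored and the variable names Cp and Resid below are hypothetical, and the FB and NMSE formulas in the comments are the commonly cited definitions, so check them against the guide before relying on them.

```sas
/* Assumption: the PROC GLMSELECT step above is rerun with one extra
   statement placed before its RUN, so that predictions and residuals
   are saved along with all input variables (including _partition):
       output out=scored predicted=Cp residual=Resid;
   The names scored, Cp and Resid are hypothetical. */

proc sql;
  create table perf as
  select _partition,
         count(Resid)   as n_obs,
         /* average squared error: mean of (observed - predicted)^2 */
         mean(Resid**2) as ASE,
         /* fractional bias: (mean obs - mean pred) / (0.5*(mean obs + mean pred)) */
         (mean(Y) - mean(Cp)) / (0.5 * (mean(Y) + mean(Cp))) as FB,
         /* normalized mean square error: mean((obs - pred)^2) / (mean obs * mean pred) */
         mean((Y - Cp)**2) / (mean(Y) * mean(Cp)) as NMSE
  from scored
  group by _partition;
quit;

proc print data=perf noobs;
run;
```

Grouping by _partition gives one row of measures each for the training, validation, and test data, which makes it easy to see whether the model degrades outside the training set. Measures based on logarithms, such as MG, would additionally require strictly positive observed and predicted values.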
09-25-2015 02:45 PM
Thanks for your reply. After searching online, I got some ideas from "Technical Descriptions and User’s Guide for the BOOT Statistical Model Evaluation Software Package, Version 2.0". The authors give several statistical performance measures as the basis for a quality model evaluation, including the following:
where Cp is the predicted value, Co is the observed value, and an overbar (C bar) denotes the mean value.
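For reference, the measures usually associated with the BOOT package (FB and MG are the two named in this thread) are commonly written as follows. This is a sketch from the standard definitions and not a copy of the guide's text, so the exact set and notation should be checked against the guide itself.

```latex
\begin{align*}
\mathrm{FB}   &= \frac{\overline{C_o} - \overline{C_p}}{0.5\,(\overline{C_o} + \overline{C_p})}
              && \text{(fractional bias)} \\
\mathrm{MG}   &= \exp\!\left(\overline{\ln C_o} - \overline{\ln C_p}\right)
              && \text{(geometric mean bias)} \\
\mathrm{NMSE} &= \frac{\overline{(C_o - C_p)^2}}{\overline{C_o}\,\overline{C_p}}
              && \text{(normalized mean square error)} \\
\mathrm{VG}   &= \exp\!\left[\overline{(\ln C_o - \ln C_p)^2}\right]
              && \text{(geometric variance)} \\
R             &= \frac{\overline{(C_o - \overline{C_o})(C_p - \overline{C_p})}}{\sigma_{C_p}\,\sigma_{C_o}}
              && \text{(correlation coefficient)} \\
\mathrm{FAC2} &= \text{fraction of data with } 0.5 \le C_p / C_o \le 2
\end{align*}
```

A perfect model would give FB = 0, NMSE = 0, and MG = VG = R = FAC2 = 1. MG and VG use logarithms, so they are only defined when both the observed and predicted values are strictly positive.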