PROC ADAPTIVEREG produces multiple fit statistics (see screenshot below). One statistic is ‘GCV R-Square’.
Thanks.
That's a good question. I asked a colleague who knows more about this area than I do. The following information is summarized from our discussions:
1. The statistic is interpreted as a goodness-of-fit measure, similar in concept to the R-square statistic in OLS regression.
2. The GCV R-Square statistic is defined as 1 - GCV(final model)/GCV(null model). The definition has a similar form to the traditional R-Square statistic. The difference is that ADAPTIVEREG uses a nonparametric technique to fit the regression, so the traditional R-Square statistic does not apply.
3. The ADAPTIVEREG algorithm is based on work by Friedman, who called his algorithm "MARS". Friedman modified the GCV function for MARS from its original version (Craven and Wahba) by manually setting the number of degrees of freedom per spline basis to a fixed number. So this version of the GCV is a specific criterion that does not otherwise appear in the literature. The criterion is mentioned on page 27 of ‘Estimating Functions of Mixed Ordinal and Categorical Variables Using Adaptive Splines’ by Friedman (1991), and in Chapter 5 of the MARS User Guide. The original Annals of Statistics paper by Friedman does not formally call the statistic "GCV R-Square", but the examples in that paper use similar ideas to quantify the goodness-of-fit for models.
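To make the definition in item 2 concrete, here is a minimal Python sketch of the calculation. It is not the procedure's own code. It assumes the usual form of the criterion, GCV = (RSS/n) / (1 - C/n)^2, where C is the effective number of parameters charged to the model. How C is obtained from the number of basis functions and the fixed degrees of freedom per basis is left as an input here, because that detail is not spelled out above. The function names, argument names, and the numbers in the example are hypothetical.

```python
def gcv(rss, n, effective_params):
    """Generalized cross-validation criterion.

    rss              -- residual sum of squares of the fitted model
    n                -- number of observations
    effective_params -- effective number of parameters C charged to the
                        model (assumption: supplied by the caller)
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= effective_params < n:
        raise ValueError("effective_params must satisfy 0 <= C < n")
    # Average squared error, inflated by the model-complexity penalty.
    return (rss / n) / (1.0 - effective_params / n) ** 2


def gcv_r_square(rss_final, rss_null, n, params_final, params_null=1.0):
    """1 - GCV(final model) / GCV(null model).

    The null model is assumed to be intercept-only, so it is charged
    one parameter by default.
    """
    return 1.0 - gcv(rss_final, n, params_final) / gcv(rss_null, n, params_null)


# Made-up numbers, for illustration only.
value = gcv_r_square(rss_final=100.0, rss_null=500.0, n=100, params_final=10.0)
print(round(value, 3))
```

With these made-up inputs the ordinary R-square would be 1 - 100/500 = 0.8, while the GCV version comes out lower, at about 0.758, because the final model is penalized for its effective number of parameters.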
I hope this gives you an overview, as well as specific references if you need more information.