R-Square on test and training dataset


Using this code snippet:

 

PROC GLMSELECT DATA = WORK.Training TESTDATA = WORK.Test;
    MODEL
    .... / selection=none stb showpvalues;
    ods output
        "Fit Statistics" = WORK.Model_Fit
        "Parameter Estimates" = WORK.ParameterEstimates
        Nobs = WORK.Nobs;
RUN;

 

The dataset WORK.Model_Fit contains:

 

Root MSE
Dependent Mean
R-Square
Adj R-Sq
AIC
AICC
SBC
ASE (Train)
ASE (Test)

 

Is the R-Square measured on the training dataset? Is it possible to obtain the R-Square for the training and test datasets?

 

 



All Replies
Respected Advisor
Posts: 4,606

Re: R-Square on test and training dataset

One option is to calculate R square yourself from the residuals as 1 - USS(resid)/CSS(dependent) for each data subset.
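
Written out, the calculation suggested above looks roughly like the following. The notation is introduced here for illustration and is not from the thread: S is the subset of rows (training or test), y_i the observed response, \hat{y}_i the model prediction, and \bar{y}_S the mean of the response within S.

```latex
R^2_S \;=\; 1 - \frac{\sum_{i \in S} (y_i - \hat{y}_i)^2}{\sum_{i \in S} (y_i - \bar{y}_S)^2},
\qquad
\bar{y}_S \;=\; \frac{1}{|S|} \sum_{i \in S} y_i
```

The numerator is USS(resid) and the denominator is CSS(dependent), both evaluated on the rows of S only. Because the baseline is the subset's own mean, the value computed on the test rows can differ noticeably from the training value and can even be negative when the model predicts worse than that mean.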

PG
Grand Advisor
Posts: 9,458

Re: R-Square on test and training dataset

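You can use the SCORE or CODE statement to score a new dataset and then calculate the R-square from the scored values.
Or refit the selected effects, which PROC GLMSELECT stores in the macro variable _GLSIND, with PROC REG:

proc glmselect;
model.........
run;

proc reg.......
model y=&_GLSIND ;
........

to get that R-square.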
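
As a rough illustration of the SCORE route, the sketch below fits on a training dataset, scores a separate test dataset, and computes the test R-square from the scored residuals. The dataset names (train, test, testScored), the variable names (predWeight, resWeight) and the sashelp.class split are assumptions chosen for the example, not part of the reply.

```sas
/* Illustrative split of sashelp.class into two datasets (assumed setup) */
data train test;
    set sashelp.class;
    if mod(_n_, 3) > 0 then output train;
    else output test;
run;

/* Fit on the training data and score the test data.              */
/* RESIDUAL= is filled in because the test data contains weight.  */
proc glmselect data=train;
    class sex;
    model weight = sex height / selection=none;
    score data=test out=testScored predicted=predWeight residual=resWeight;
run;

/* Test R-square = 1 - USS(residual)/CSS(response), stored in a macro variable */
proc sql noprint;
    select 1 - uss(resWeight)/css(weight) format=7.4
        into :r2_test trimmed
        from testScored;
quit;

%put &=r2_test;
```

The same PROC SQL step run against the training data, scored in the same way, would give the training R-square for comparison.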

Contributor
Posts: 71

Re: R-Square on test and training dataset

Thanks, but sorry, this does not make much sense to me. I am able to use PROC PLM, so I could potentially use that to calculate the R-square against the test dataset myself. The final aim is to put it into a macro variable.
Solution
‎02-13-2017 05:41 AM
Respected Advisor
Posts: 4,606

Re: R-Square on test and training dataset

Here is an example:

 

/* Split a dataset into training and test subsets */
data splitClass;
    set sashelp.class;
    if mod(_n_, 3) > 0 then role = "training";
    else role = "test";
run;

/* Fit on the training rows; ROLEVAR= tells GLMSELECT which rows are test rows */
proc glmselect data=splitClass;
    class sex;
    model weight = sex height / selection=none;
    partition rolevar=role(test="test" train="training");
    output out=outClass residual=resWeight;
run;

/* R-square per subset = 1 - USS(residual)/CSS(response), saved as macro variables */
proc sql noprint;
    select 1 - uss(resWeight)/css(weight) as rsquare format=7.4
        into :r2_training trimmed
        from outClass where role="training";
    select 1 - uss(resWeight)/css(weight) as rsquare format=7.4
        into :r2_test trimmed
        from outClass where role="test";
quit;

/* Write both values to the log */
%put &=r2_training &=r2_test;
PG