lei2004
Calcite | Level 5

Hi,

I have a question about how to get the R^2 for a transformed linear model. The R^2 should be for the base model on the original scale. The fitted model is log(y)=b0+b1*x+e. The R^2 I need is for the model transformed back to y=exp(b0+b1*x). I am confused about how to get the R^2 for the model y=exp(b0+b1*x+e). Can anyone help me figure out the code? Thanks!

I have attached an article on this topic.

2 REPLIES
mkeintz
PROC Star

If you are using a linear regression to fit the transformed model log(y)=b0+b1*x, with b0 and b1 chosen to maximize R-square, then what do you actually mean by R-square for the "original" model y=exp(b0+b1*x)? In the estimated model, R-square is a proportional reduction in error, where the model error is the sum of squared (actual - estimate) = sum of squared (log(y) - estimate(log(y))). I don't see how one could transform that to get a similarly defined proportional reduction in error on the original scale.
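Written out, the log-scale R-square described above looks like the following. The notation (n observations, hats for fitted values, a bar for the mean) is assumed here for illustration and is not taken from the attached article.

```latex
R^2_{\log} = 1 - \frac{\sum_{i=1}^{n}\bigl(\log y_i - \widehat{\log y_i}\bigr)^2}
                      {\sum_{i=1}^{n}\bigl(\log y_i - \overline{\log y}\bigr)^2},
\qquad \widehat{\log y_i} = b_0 + b_1 x_i
```

Both sums are taken on the log scale, which is why this quantity says nothing directly about squared errors in y itself.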

 

I guess you could get the correlation of the observed y and the estimated exp(b0+b1*x), and I imagine there is a way to get an R2 from that. I just don't know what it would mean.
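One way to read that suggestion is as a squared Pearson correlation between the observed values and the back-transformed predictions. This is a sketch of that reading, with the notation assumed for illustration:

```latex
\hat{y}_i = \exp(b_0 + b_1 x_i), \qquad
r^2 = \bigl[\operatorname{corr}(y, \hat{y})\bigr]^2
    = \frac{\Bigl[\sum_i (y_i - \bar{y})\,(\hat{y}_i - \bar{\hat{y}})\Bigr]^2}
           {\sum_i (y_i - \bar{y})^2 \; \sum_i (\hat{y}_i - \bar{\hat{y}})^2}
```

This value always lies between 0 and 1, but for back-transformed predictions it need not equal a proportional reduction in squared error on the original scale.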

 

 

FreelanceReinh
Jade | Level 19

Hi @lei2004 and welcome to the SAS Support Communities!

 

I agree that the formula for R1² in the article is a bit confusing, but (without having access to the original source by Kvålseth) I think you just need to insert the yi, their mean and the back-transformed predicted values (EDIT: that is: exp("(log yi) hat")).
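As I read it, the quantity described above can be written as follows. The symbols are assumed for illustration, since the article's own notation is not reproduced in this thread:

```latex
R_1^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2},
\qquad \hat{y}_i = \exp\bigl(\widehat{\log y_i}\bigr)
```

The numerator is the error sum of squares of the back-transformed predictions and the denominator is the corrected sum of squares of y, both on the original scale. Nothing forces the numerator to be smaller than the denominator, so this measure can be negative.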

 

Taking the first of the two numeric examples from section 3 of the article:

 

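/* Example data; log_y is the response of the transformed model */
data have;
input x y;
log_y=log(y);
cards;
0 .5
1 4
2 6
3 7
16 12
20 22
;

/* Corrected sum of squares (CSS) of y on the original scale */
proc summary data=have;
var y;
output out=stats css=css;
run;

/* Fit log(y)=b0+b1*x and save the predicted values of log(y) */
proc reg data=have;
model log_y=x;
output out=pred p=log_y_hat;
quit;

/* Accumulate SSE from the back-transformed predictions, then compute 1-SSE/CSS */
data want(keep=sse css rsq);
if _n_=1 then set stats;
set pred end=last;
sse+(y-exp(log_y_hat))**2;
if last;
rsq=1-sse/css;
run;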

Result:

  css        sse        rsq

287.208    34.9594    0.87828

(matching the authors' result 0.88)

 

For the second example I got rsq=-0.31642, again matching the (corrected) value in the article (up to a minor rounding issue).

 

Equivalently, you could obtain CSS from PROC REG (see ODS table ANOVA for model y=x, which could be included in the existing PROC REG step as model log_y y=x) instead of PROC SUMMARY.
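A minimal sketch of that alternative, assuming the data set HAVE (x, y, log_y) from the example above. The data set names ANOVA_TAB, PRED2 and WANT2 are hypothetical, and the ANOVA column names used here (Dependent, Source, SS) are as I recall them, so it is worth confirming them with PROC CONTENTS or PROC PRINT before relying on this.

```sas
/* Sketch only: assumes data set HAVE (x, y, log_y) already exists.        */
/* ANOVA_TAB, PRED2 and WANT2 are hypothetical names chosen for this sketch. */
ods output ANOVA=anova_tab;       /* save the ANOVA tables as a data set */
proc reg data=have;
  model log_y y = x;              /* one fit per dependent variable */
  output out=pred2 p=log_y_hat;   /* a single name: predictions for log_y, the first dependent */
quit;

data want2(keep=sse css rsq);
  /* Corrected Total SS of the fit for y serves as CSS on the original scale */
  /* (column names Dependent, Source and SS are assumed; verify them first)  */
  if _n_=1 then set anova_tab(where=(upcase(Dependent)='Y' and Source='Corrected Total'));
  set pred2 end=last;
  sse + (y - exp(log_y_hat))**2;  /* SSE from back-transformed predictions */
  if last;
  css = SS;
  rsq = 1 - sse/css;
run;
```

The only change from the PROC SUMMARY version is where CSS comes from. The accumulation of SSE and the final 1-SSE/CSS step are the same.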
