Hi,
I have a question about how to get the R^2 for a transformed linear model. This R^2 should be for the base model on the original scale. The fitted model is log(y)=b0+b1*x+e. The R^2 I need is for the model transformed back to y=exp(b0+b1*x). I am confused about how to get the R^2 for the model y=exp(b0+b1*x+e). Can anyone help me figure out the code? Thanks!
I have attached an article about this topic.
If you are using linear regression to fit the transformed model log(y)=b0+b1*x, with b0 and b1 chosen to maximize R-square, then what do you actually mean by R-square for the "original" model y=exp(b0+b1*x)? In the estimated model, R-square is a proportional reduction in error, where the model error is the sum of squared (actual - estimate) = sum of squared (log(y) - estimate(log(y))), but I don't see how one could transform that to get a similarly defined proportional reduction in error on the original scale.
I guess you could compute the correlation between the observed y and the estimated exp(b0+b1*x), and I imagine there is a way to get an R2 from that. I just don't know what it would mean.
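For reference, the proportional reduction in error described above can be written out as follows. This is only a sketch in notation that is not from the thread: the hat denotes the fitted value on the log scale and the bar denotes the sample mean of log(y).

```latex
R^2_{\log}
  = 1 - \frac{\sum_{i=1}^{n} \bigl(\log y_i - \widehat{\log y_i}\bigr)^2}
             {\sum_{i=1}^{n} \bigl(\log y_i - \overline{\log y}\bigr)^2},
\qquad
\widehat{\log y_i} = b_0 + b_1 x_i
```

Both sums are taken on the log scale, which is why this quantity does not carry over directly to the original scale of y.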
Hi @lei2004 and welcome to the SAS Support Communities!
I agree that the formula for R1² in the article is a bit confusing, but (without having access to the original source by Kvålseth) I think you just need to insert the yi, their mean and the back-transformed predicted values (EDIT: that is: exp("(log yi) hat")).
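Written out, my reading of that formula is sketched below. The symbols are my own assumption rather than a quote from the article: the numerator is the sum of squared errors of the back-transformed predictions and the denominator is the corrected sum of squares of y.

```latex
R_1^2
  = 1 - \frac{\sum_{i=1}^{n} \bigl(y_i - \hat{y}_i\bigr)^2}
             {\sum_{i=1}^{n} \bigl(y_i - \bar{y}\bigr)^2},
\qquad
\hat{y}_i = \exp\!\bigl(\widehat{\log y_i}\bigr)
```

Because the back-transformed predictions are not least-squares fits on the original scale, the numerator can exceed the denominator, so this quantity can be negative.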
Taking the first of the two numeric examples from section 3 of the article:
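/* Example data; log_y is the response of the transformed model */
data have;
  input x y;
  log_y=log(y);
  cards;
0 .5
1 4
2 6
3 7
16 12
20 22
;

/* Corrected sum of squares of y on the original scale */
proc summary data=have;
  var y;
  output out=stats css=css;
run;

/* Fit log(y)=b0+b1*x and save the predicted log values */
proc reg data=have;
  model log_y=x;
  output out=pred p=log_y_hat;
quit;

/* Accumulate the SSE of the back-transformed predictions, then compute R-square */
data want(keep=sse css rsq);
  if _n_=1 then set stats;
  set pred end=last;
  sse+(y-exp(log_y_hat))**2;
  if last;
  rsq=1-sse/css;
run;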
Result:
css        sse        rsq
287.208    34.9594    0.87828
(matching the authors' result 0.88)
For the second example I got rsq=-0.31642, again matching the (corrected) value in the article (up to a minor rounding issue).
Equivalently, you could obtain CSS from PROC REG (see ODS table ANOVA for model y=x, which could be included in the existing PROC REG step as model log_y y=x) instead of PROC SUMMARY.