Hi,
I have a question regarding variable transformation using Box-Cox.
I'm a student taking a Regression Analysis class, and here is the code for the example:
data plasma;
input age plevel;
datalines;
0 13.44
0 12.84
0 11.91
0 20.09
0 15.60
1.0 10.11
1.0 11.38
1.0 10.28
1.0 8.96
1.0 8.59
2.0 9.83
2.0 9.00
2.0 8.65
2.0 7.85
2.0 8.88
3.0 7.94
3.0 6.01
3.0 5.14
3.0 6.90
3.0 6.77
4.0 4.86
4.0 5.10
4.0 5.67
4.0 5.75
4.0 6.23
;
run;
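/* Baseline fit on the untransformed response */
proc reg data=plasma;
   model plevel=age;
run;
/* Box-Cox search over lambda; the transformed response is written as tplevel */
ods output boxcox=bc details=details;
proc transreg data=plasma PBOXCOXTABLE detail;
   model boxcox(plevel / lambda=-1.2 to 1.2 by 0.1 convenient)
         = identity(age);
   output out=bc_plasma;
run;
proc print data=bc_plasma;
run;
/* Refit using the transformed response */
proc reg data=bc_plasma;
   model tplevel=age;
run;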
So the best lambda for the transformation is -0.50. I verified this and got the same lambda by manually computing the Box-Cox formula in R.
However, I am wondering how the transformed Y variable is actually calculated.
According to my textbook, after getting lambda = -0.50, the transformation of Y is Y^(-0.50).
So for the first observation, Y = 13.44, the transformed value using lambda = -0.5 is 0.2727.
But the value in the output is 2.59823, and it does not come from (Y^(-0.50) - 1)/(-0.5) either.
I am hesitant to use this output variable directly as the transformed Y to fit the regression line, because I don't know how it is calculated.
Can anyone explain?
Thanks!
You asked for a convenient lambda, and as shown in the output, the convenient lambda is 0, so that is the value PROC TRANSREG uses for the transformation. If you remove the convenient option, you will get what you expect.
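As a minimal sketch of that change, here is the same step from the question with only the convenient option dropped. The output data set name bc_plasma_best is a hypothetical name chosen for illustration, so that the earlier bc_plasma is not overwritten.

```sas
/* Same Box-Cox search as in the question, without the convenient option, */
/* so the transformation uses the best lambda from the grid.              */
/* bc_plasma_best is a hypothetical data set name used for illustration.  */
proc transreg data=plasma;
   model boxcox(plevel / lambda=-1.2 to 1.2 by 0.1)
         = identity(age);
   output out=bc_plasma_best;
run;

proc print data=bc_plasma_best;
run;
```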
Thank you so much!!
Now I get the correct Y transformation output for the best lambda, lambda = -0.5 🙂
But for the convenient lambda = 0 in this example, isn't the transformation supposed to be log(Y) = log(13.44) = 1.128? That is not what the output shows either...
How is this transformation calculated then?
What is the difference between choosing the best lambda and the convenient lambda? Is there any reason to choose the convenient lambda when we know both the best and the convenient values?
Thanks!
Use natural log (base e) not log base 10.
Check out the convenient and cll= options. Some analysts might prefer a more meaningful transformation (e.g. linear, log, or square root) over a less meaningful transformation if the parameter for a meaningful transformation is in the confidence interval.
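To see where each number in this thread comes from, the candidate transformations for the first observation (Y = 13.44) can be computed by hand in a DATA step. This is only an illustrative check, and the variable names are made up for the example. The values written to the log should be roughly 0.273, 1.454, 2.598 and 1.128, so the 2.59823 in the original output corresponds to the natural log.

```sas
/* Sketch: candidate transformations of the first observation, Y = 13.44. */
/* All variable names here are illustrative.                              */
data _null_;
   y = 13.44;
   lambda = -0.5;
   power_only = y**lambda;               /* textbook form Y^lambda          */
   bc_best    = (y**lambda - 1) / lambda; /* Box-Cox form for lambda ne 0   */
   ln_y       = log(y);                  /* natural log, used for lambda=0  */
   log10_y    = log10(y);                /* base-10 log, for comparison     */
   put power_only= bc_best= ln_y= log10_y=;
run;
```

In SAS, log() is the natural log and log10() is the base-10 log, which is why log(13.44) computed in base 10 did not match the output.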