ANKH1
Pyrite | Level 9
Thank you!
Rick_SAS
SAS Super FREQ

The documentation for the TRANSREG section on the Box-Cox transformation says,

"This family of transformations of the positive [emphasis added] dependent variable y is controlled by the parameter ...."

 

It also states that you can "specify the PARAMETER=c transformation option when you want to shift the values of y, usually to avoid negatives." So when DV=0 for some observations, you can get rid of the error and apply the Box-Cox transformation by adding any positive value, such as the following (which adds 1):

 

proc transreg data=Have;
   model BoxCox(DV / lambda=1 parameter=1) = class(cat);
run;
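
For reference, the shifted Box-Cox family is commonly written as below. This is a sketch of the standard textbook form, with c standing for the PARAMETER= value and lambda for the LAMBDA= value; the TRANSREG documentation gives the exact definition the procedure uses.

```latex
\[
\mathrm{BC}(y;\lambda,c) =
\begin{cases}
\dfrac{(y+c)^{\lambda}-1}{\lambda}, & \lambda \neq 0,\\[1.5ex]
\log(y+c), & \lambda = 0,
\end{cases}
\qquad y + c > 0 .
\]
```

With c = 1, an observation with y = 0 is shifted to 1, so the transformation is defined for every lambda, including the log case.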
ANKH1
Pyrite | Level 9

Hi! 

Thanks, by adding the parameter=1 it ran. 

 

proc transreg details data=sample1 ;
model boxcox(NFSDDFI / lambda = -3 to 3 by 0.25 parameter =1) = CLASS(ANIMALCAT);
run;

 

I looked in "Details" and this was the output:

[Attached image: Capture.PNG]

 

It recommends lambda=0. I log transformed but it didn't work since I have 11 zeros. 

Below is the code:

 

DATA sample2;
SET sample1;
TRANSFI=log(NFSDDFI);
RUN;

 

proc glm data=sample2;
class ANIMALCAT;
model TRANSFI= ANIMALCAT;
output out = notrans r= resid;
run;
symbol1 i=sm70;

 

proc univariate data=notrans noprint;
var resid;
histogram resid/normal kernel;
qqplot resid/normal (mu = est sigma=est);
run;

 

I tried using another DV that had no zeros, but after I transformed it with the recommended lambda, it still was not normally distributed.

 

Am I reading the output from proc transreg correctly?

Rick_SAS
SAS Super FREQ

Yes, you are reading the output correctly, but you need to incorporate the PARAMETER=1 information:

 

DATA sample2;
SET sample1;
TRANSFI=log(NFSDDFI + 1);
RUN;
ANKH1
Pyrite | Level 9

Perfect! But according to the Kolmogorov-Smirnov (K-S) test, the residuals are still not normal, even though the histogram looks pretty normal to me. What should I do?

[Attached images: Capture.PNG, Capture2.PNG]

Rick_SAS
SAS Super FREQ

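If you have 19,900 observations, the test will almost always reject normality. Very large samples invariably produce statistically significant goodness-of-fit (GOF) tests, because even tiny departures from the hypothesized distribution become detectable.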

 

I don't know what you are trying to accomplish, but you probably don't need to "do" anything. Your parameter estimates and contrasts are probably spot-on. From the graph, I would expect the p-values and CIs for the estimates to be conservative, since it looks like the distribution of the residuals has lighter tails than a normal distribution. So your 95% CI might actually be a 96% or 97% CI.
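
To see the large-sample effect for yourself, here is a minimal simulation sketch. The data set name (sim), the seed, and the choice of a t distribution with 20 degrees of freedom as a stand-in for "almost normal" data are all assumptions made for illustration; none of them come from the analysis in this thread.

```sas
/* Simulate 19,900 values from a distribution that is close to,  */
/* but not exactly, normal (t with 20 df is an arbitrary choice). */
data sim;
   call streaminit(12345);          /* arbitrary seed */
   do i = 1 to 19900;
      x = rand("T", 20);
      output;
   end;
run;

/* The NORMAL option requests the tests for normality;           */
/* the QQPLOT statement gives a graphical check of the same fit. */
proc univariate data=sim normal;
   var x;
   histogram x / normal;
   qqplot x / normal(mu=est sigma=est);
run;
```

Compare the Q-Q plot, which should follow the reference line except possibly in the extreme tails, with the p-values in the "Tests for Normality" table. Whether a given test rejects depends on the seed and on how far the simulated data are from normal, so treat the output as an illustration rather than a proof.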

ANKH1
Pyrite | Level 9

OK, I will carry on with our analyses using the transformed variable. Thank you very much!
