Hi,
I have the following two data sets and below is the delay in days between actions by each of the sample groups.
The split was as follows:
Group A Test - 20%
Group B Control - 80%
With the data being negatively skewed, I'm just curious as to the best method for calculating whether there is a significant change in days between groups.
Firstly, as it is negatively skewed, am I best to normalise the data set by using a logarithmic function?
Secondly, for testing the significance, should I be using the Kolmogorov-Smirnov test or the Wilcoxon-Mann-Whitney test (skewed distributions)?
I have provided the output of the distribution of the groups below.
Any help would be greatly appreciated.
Thanks!
A log transformation would definitely not work: it compresses the right tail, which is the opposite of what you need for negatively skewed data. You could try the Box-Cox transformation. This can be done with PROC TRANSREG (see example 6). If the data for the two groups are stacked, with g=0 for the first group and g=1 for the second, then you would use:
model boxcox(y) = identity(g);
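In case the stacking step is unclear, here is a minimal sketch of how it might look. The data set names group_a, group_b and stacked are hypothetical, and each input is assumed to hold the response in a numeric variable y:

```sas
/* Hypothetical inputs: group_a and group_b, each with a numeric variable y */
data stacked;
    set group_a(in=in_a) group_b;
    /* g=0 for rows read from the first group, g=1 for rows from the second */
    if in_a then g = 0;
    else g = 1;
run;

/* Box-Cox transformation of y with the group indicator as predictor */
proc transreg data=stacked;
    model boxcox(y) = identity(g);
run;
```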
Thanks for the reply, do you have a reference for example 6?
The reference is the SAS/STAT User's Guide. It was example 6 in an older version of SAS. This example has been replaced by a more complex example in SAS 9.3 and later:
This is more complex than you need.
I think the example you mentioned can still be found at: Documentation
Thanks again for your prompt response.
The groups are stacked, so I have assigned a numeric value to each (0 = Control, 1 = Test).
However, when I run the code below, SAS reports that invalid values were encountered, and I cannot find any more information online about the cause.
proc transreg data=test;
model boxcox(day_delay)=identity(group_class);
run;
Looking at the values, all are valid numbers and are formatted as numeric (there are 0s in the data set).
Thanks again!
Oops, I wasn't paying attention to the fact that you have 0s. This transformation is only defined for strictly positive values. You could add a small constant to all the data, which is commonly done but ad hoc. You might want to resort to nonparametric methods instead. You could use either of the nonparametric tests you mentioned.
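For reference, the "add a small constant" workaround could be sketched as follows. The shift of 1 and the names test_shifted and day_delay_shifted are assumptions for illustration only; the choice of constant is arbitrary and will influence the estimated Box-Cox parameter:

```sas
/* Shift every value so that all are strictly positive.          */
/* The constant 1 is an arbitrary, illustrative choice.          */
data test_shifted;
    set test;
    day_delay_shifted = day_delay + 1;
run;

/* Same model as before, applied to the shifted variable */
proc transreg data=test_shifted;
    model boxcox(day_delay_shifted) = identity(group_class);
run;
```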
Do you need to transform them if you're using nonparametric tests?
The key is that the distributions of the two groups have the same shape, which, by eyeballing, I'd say looks pretty good.
Do not transform.
Thanks again for your reply. So I don't need to transform, as the Kolmogorov-Smirnov test and the Wilcoxon-Mann-Whitney test are both nonparametric.
Would you trial both tests? Or, based on the above, is one a better fit than the other?
Thanks for your reply, it makes sense! Based on the above information, would you use the Kolmogorov-Smirnov test or the Wilcoxon-Mann-Whitney test?
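If it helps to see both tests side by side, one possible sketch is to request them in a single PROC NPAR1WAY step on the untransformed data. This assumes the data set test and the variables day_delay and group_class from the code posted earlier in the thread:

```sas
/* WILCOXON requests the Wilcoxon rank-sum (Mann-Whitney) test.   */
/* EDF requests empirical distribution function statistics,       */
/* which include the two-sample Kolmogorov-Smirnov test.          */
proc npar1way data=test wilcoxon edf;
    class group_class;   /* 0 = Control, 1 = Test */
    var day_delay;       /* untransformed delay in days */
run;
```

Broadly, the Wilcoxon test is most sensitive to a shift in location between the groups, while the Kolmogorov-Smirnov test responds to any difference between the two distributions, including shape and spread.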