As shown in the following example, the order of operations--demeaning vs. winsorizing--makes a big difference in the results. Which should I do first?
In the example below, we're starting with 10 observations: 0, 0, 50, 50, 50, 50, 70, 70, 80, and 80. In the columns/procedure on the left, we demean first. In the columns/procedure on the right, we winsorize first. (I normally winsorize at the 1% and 99% levels, not 10% and 90%, but had to use the latter numbers for the sake of simplicity in the example.)
Thanks!
Left (demean first) | | | Right (winsorize first) | |
Observed X | Demeaned X | Winsorized (10%, 90%) | Observed X | Winsorized (10%, 90%) | Demeaned X
0 | -50 | 0 | 0 | 50 | -8
0 | -50 | 0 | 0 | 50 | -8
50 | 0 | 0 | 50 | 50 | -8
50 | 0 | 0 | 50 | 50 | -8
50 | 0 | 0 | 50 | 50 | -8
50 | 0 | 0 | 50 | 50 | -8
70 | 20 | 20 | 70 | 70 | 12
70 | 20 | 20 | 70 | 70 | 12
80 | 30 | 20 | 80 | 70 | 12
80 | 30 | 20 | 80 | 70 | 12
Mean of Observed X: 50 | | | | Mean of Winsorized X: 58 |
If the purpose of these operations is to protect against outliers, you should winsorize before centering, because outliers can have a very large influence on the mean used for centering. So it is better to pull them in first.
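For anyone who wants to check the two orderings numerically, here is a minimal Python sketch that reproduces the numbers in the question's table. The helper `winsorize_k` is hypothetical, and it assumes the table was built by replacing the two lowest and two highest observations with their nearest remaining neighbors; that rule is inferred from the table values, not stated in the question.

```python
from statistics import mean

def winsorize_k(values, k):
    """Hypothetical helper: replace the k smallest values with the
    (k+1)-th smallest and the k largest with the (k+1)-th largest."""
    ordered = sorted(values)
    lo, hi = ordered[k], ordered[-k - 1]
    return [min(max(v, lo), hi) for v in values]

def demean(values):
    """Subtract the ordinary (non-robust) mean from every value."""
    m = mean(values)
    return [v - m for v in values]

x = [0, 0, 50, 50, 50, 50, 70, 70, 80, 80]

# Left-hand procedure: demean, then winsorize (k=2 is an assumption).
demean_first = winsorize_k(demean(x), 2)

# Right-hand procedure: winsorize, then demean.
winsorize_first = demean(winsorize_k(x, 2))

print(demean_first)
print(winsorize_first)
print(mean(demean_first), mean(winsorize_first))
```

In this example the two results differ only by a constant shift of 8, the gap between the raw mean (50) and the winsorized mean (58). Demeaning first leaves the final column with a mean of 8 rather than 0, because the centering used a mean that the extreme values had already pulled down.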
To echo PGStats, the Winsorized mean is a robust estimate of location. If your goal is to center the data in a robust way, use a robust estimate of location. If you are also going to scale the data, use a robust estimate of scale.
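As a rough illustration of robust centering and scaling, the Python sketch below uses the median for location and the median absolute deviation (MAD) for scale. These are one common pair of robust estimates, swapped in here for illustration; they are not necessarily the estimates the replies had in mind, and the data values are made up.

```python
from statistics import mean, median

# Made-up data with one obvious outlier (500).
y = [48, 50, 51, 52, 49, 50, 500]

# Ordinary mean is dragged toward the outlier; the median is not.
ordinary_center = mean(y)
robust_center = median(y)

# MAD: median of absolute deviations from the median.
mad = median(abs(v - robust_center) for v in y)

# 1.4826 rescales the MAD to be comparable to a standard deviation
# for normally distributed data.
robust_scale = 1.4826 * mad

# Robustly centered and scaled values.
robust_z = [(v - robust_center) / robust_scale for v in y]

print(ordinary_center, robust_center, mad)
print(robust_z)
```

With these numbers the ordinary mean lands well above every non-outlying observation, while the median stays at 50, so the robustly centered values for the bulk of the data remain close to zero.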
Thanks very much Rick! What you and PG are saying makes perfect sense!
J.J.
Thanks very much PG! What you and Rick are saying makes perfect sense!
J.J.