As shown in the following example, the order of operations--demeaning vs. winsorizing--makes a big difference in the results. Which should I do first?
In the example below, we start with 10 observations: 0, 0, 50, 50, 50, 50, 70, 70, 80, and 80. In the columns on the left, we demean first; in the columns on the right, we winsorize first. (I normally winsorize at the 1% and 99% levels, not 10% and 90%, but used the latter here to keep the example simple.)
Thanks!
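| Observed X | Demeaned X | Winsorized (10%,90%) | Observed X | Winsorized (10%,90%) | Demeaned X |
|---|---|---|---|---|---|
| 0 | -50 | 0 | 0 | 50 | -8 |
| 0 | -50 | 0 | 0 | 50 | -8 |
| 50 | 0 | 0 | 50 | 50 | -8 |
| 50 | 0 | 0 | 50 | 50 | -8 |
| 50 | 0 | 0 | 50 | 50 | -8 |
| 50 | 0 | 0 | 50 | 50 | -8 |
| 70 | 20 | 20 | 70 | 70 | 12 |
| 70 | 20 | 20 | 70 | 70 | 12 |
| 80 | 30 | 20 | 80 | 70 | 12 |
| 80 | 30 | 20 | 80 | 70 | 12 |
| Mean: 50 | | | | Mean: 58 | |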
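For anyone who wants to check the arithmetic, here is a minimal Python sketch of the two orderings. It is an illustration, not the procedure actually used to build the table. The helper names `winsorize_by_count` and `demean` are made up for this example, and the rule "replace the 2 most extreme observations in each tail with the nearest remaining value" is an assumption read off the table's numbers; a percentile-based 10%/90% routine may choose different cutoffs on a sample this small.

```python
def winsorize_by_count(values, k):
    """Clamp the k smallest values up to the (k+1)-th smallest and the
    k largest values down to the (k+1)-th largest (hypothetical helper)."""
    ordered = sorted(values)
    lower, upper = ordered[k], ordered[-k - 1]
    return [min(max(v, lower), upper) for v in values]


def demean(values):
    """Subtract the sample mean from every value (hypothetical helper)."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


x = [0, 0, 50, 50, 50, 50, 70, 70, 80, 80]
K = 2  # assumption: the table changes two observations in each tail

# Left-hand procedure: demean, then winsorize.
demean_then_winsorize = winsorize_by_count(demean(x), K)

# Right-hand procedure: winsorize, then demean.
winsorize_then_demean = demean(winsorize_by_count(x, K))

# Element-wise difference between the two final columns.
gap = [a - b for a, b in zip(demean_then_winsorize, winsorize_then_demean)]

print(demean_then_winsorize)
print(winsorize_then_demean)
print(gap)
```

Under these assumptions the two final columns differ by a constant 8 in every row, which is the gap between the winsorized mean (58) and the raw mean (50). Only the winsorize-first column averages to zero; the demean-first column averages to 8.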