jjsingh04
Obsidian | Level 7

As shown in the following example, the order of operations--demeaning vs. winsorizing--makes a big difference in the results. Which should I do first? 

 

In the example below, we start with 10 observations: 0, 0, 50, 50, 50, 50, 70, 70, 80, and 80. In the columns/procedure on the left, we demean first. In the columns/procedure on the right, we winsorize first. (I normally winsorize at the 1% and 99% levels, not 10% and 90%, but had to use the latter numbers for the sake of simplicity in the example.)

 

Thanks! 

 

 

       Demean first                                 | Winsorize first
       Observed X  Demeaned X  Winsorized (10%,90%) | Observed X  Winsorized (10%,90%)  Demeaned X
                0         -50                     0 |          0                    50          -8
                0         -50                     0 |          0                    50          -8
               50           0                     0 |         50                    50          -8
               50           0                     0 |         50                    50          -8
               50           0                     0 |         50                    50          -8
               50           0                     0 |         50                    50          -8
               70          20                    20 |         70                    70          12
               70          20                    20 |         70                    70          12
               80          30                    20 |         80                    70          12
               80          30                    20 |         80                    70          12
Mean:          50                                   |                               58
Our lives are enriched by the people around us.
1 ACCEPTED SOLUTION

PGStats
Opal | Level 21

If the purpose of these operations is to protect against outliers, you should winsorize before centering, because outliers can have a very large influence on the mean used for centering. So it is better to deal with them first.

PG
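
For readers who want to check the arithmetic, here is a minimal Python sketch (not SAS code) of the two orderings applied to the ten observations from the question. The helper names `clamp` and `demean` are made up for this illustration, and the limits 50 and 70 are an assumption chosen because they reproduce the table in the question; a percentile-based rule in a given package may pick different cut-offs.

```python
from statistics import mean

# The ten observations from the question.
x = [0, 0, 50, 50, 50, 50, 70, 70, 80, 80]

def clamp(values, lower, upper):
    """Winsorize by pulling every value into the interval [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]

def demean(values):
    """Subtract the ordinary (non-robust) mean from every value."""
    m = mean(values)
    return [v - m for v in values]

# Assumption: these limits reproduce the 10%/90% columns in the question's table.
LOWER, UPPER = 50, 70

# Order A: demean with the raw mean, then winsorize.
# The limits are shifted by the raw mean so they refer to the same cut-offs.
raw_mean = mean(x)
demean_first = clamp(demean(x), LOWER - raw_mean, UPPER - raw_mean)

# Order B: winsorize, then demean with the winsorized mean.
winsorize_first = demean(clamp(x, LOWER, UPPER))

print("demean first:   ", demean_first, "mean =", mean(demean_first))
print("winsorize first:", winsorize_first, "mean =", mean(winsorize_first))
```

Under these assumptions, the winsorize-first result has mean 0, while the demean-first result has mean 8. The raw mean of 50 was pulled down by the two zeros, so subtracting it before winsorizing leaves the data off-center.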

Rick_SAS
SAS Super FREQ

To echo PGStats, the Winsorized mean is a robust estimate of location. If your goal is to center the data in a robust way, use a robust estimate. If you are going to scale the data, use a robust estimate of scale.
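
As a rough illustration of that advice, the Python sketch below (not SAS code) centers the question's ten observations with a Winsorized mean and scales them with a robust scale estimate. The choice of the median absolute deviation (MAD) as the scale estimate is an assumption made for this example, as are the function names and the limits 50 and 70 taken from the question's table; other robust scale estimates would serve equally well.

```python
from statistics import mean, median

# The ten observations from the question.
x = [0, 0, 50, 50, 50, 50, 70, 70, 80, 80]

def winsorized_mean(values, lower, upper):
    """Mean of the values after pulling each one into [lower, upper]."""
    return mean([min(max(v, lower), upper) for v in values])

def mad(values):
    """Median absolute deviation from the median (unscaled)."""
    center = median(values)
    return median([abs(v - center) for v in values])

# Assumption: limits of 50 and 70 match the winsorization in the question's table.
location = winsorized_mean(x, 50, 70)   # robust estimate of location
scale = mad(x)                          # one possible robust estimate of scale

# Center and scale the original observations with the robust estimates.
standardized = [(v - location) / scale for v in x]

print("location:", location)
print("scale:   ", scale)
print("standardized:", standardized)
```

For these data the Winsorized mean is 58 and the MAD is 20, so the two zeros end up at -2.9 on the standardized scale instead of dragging the center toward themselves.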

jjsingh04
Obsidian | Level 7

Thanks very much Rick! What you and PG are saying makes perfect sense! 

J.J.

jjsingh04
Obsidian | Level 7

Thanks very much PG! What you and Rick are saying makes perfect sense! 

J.J.

 

