## How to winsorize variables

Regular Contributor
Posts: 202

I am trying to use code I found in an old thread to winsorize variables. The code runs without error, but the means of the winsorized variables (wvar1, wvar2, wvar3) are no different from the means of the unwinsorized variables.

There are a few things I don't understand (Questions also embedded in the syntax).

1. What is `_n_`? What role is it playing?
2. Are these the winsorized variables (i.e. wvar1, wvar2, wvar3)?
   1. If so, the means are no different from the means of the variables prior to the "winsorization".
3. How would I interpret `min(max(val{_V},wlo{_V}),whi{_V});`? I'm confused by all these embedded arrays. I'm a relative beginner with SAS.
4. Are these the "min" and "max" functions here? What is the purpose of including them here?

```
proc univariate data=have noprint;
   var var1 var2 var3;
   output out=_testing pctlpts=10 90 pctlpre=__var1 __var2 __var3;
run;

data want;
   set have;
   if _n_=1 then set _testing;
   /* What is _n_? What role is it playing here? */

   array wlo  {*} __var1_10 __var2_10 __var3_10;
   array whi  {*} __var1_90 __var2_90 __var3_90;
   array wval {*} wvar1 wvar2 wvar3;
   /* Are these the variables that are supposed to contain the winsorized means? */
   array val  {*} var1 var2 var3;

   do _V=1 to dim(val);
      wval{_V}=min(max(val{_V},wlo{_V}),whi{_V});
      /* Any help interpreting this? Are these the "min" and "max" functions here?
         What is the purpose of including them here? */
   end;
run;
```

Thank you very much for all your continued support. It is greatly appreciated. I would be lost w/o this forum.

Posts: 1,125

## Re: How to winsorize variables

Hi @jcorroon,

In this brand new thread I've just pointed out that calculating Winsorized means "manually" (i.e. in a data step, possibly using a macro) can lead to different results than using PROC UNIVARIATE's WINSORIZED= option. I think this option was introduced only in SAS 9 [EDIT: no, I was wrong, sorry, it was introduced in SAS version 7], so that programs and macros for manual calculation might be obsolete, unless you need the Winsorized data (not only the means) or you insist on using a different algorithm.
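
For reference, here is a minimal sketch of that built-in route, assuming the data set `have` and the variables `var1`, `var2`, `var3` from the original post. A fraction such as `WINSORIZED=0.1` requests that 10% of the observations be Winsorized in each tail; the `ods select` line only restricts the printed output to the Winsorized-means table. This reports the Winsorized means; it does not create a data set of Winsorized values.

```sas
/* Sketch only: HAVE and var1-var3 are the names used in the original post. */
ods select WinsorizedMeans;           /* print just the Winsorized-means table */
proc univariate data=have winsorized=0.1;
   var var1 var2 var3;
run;
```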

I can take a closer look at your program tomorrow (CET) if you still want to use it.
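
In the meantime, a few notes on the program as posted, with a hedged sketch of how it might be repaired. `_n_` is the automatic variable that counts DATA step iterations, so `if _n_=1 then set _testing;` reads the one-row percentile data set once; variables read with SET are retained, so the percentiles are available on every row of `have`. The expression `min(max(x, lo), hi)` clamps `x` to the interval [lo, hi]. One likely reason the means did not change: PCTLPRE= prefixes are concatenated directly with the percentile, so `pctlpre=__var1` with `pctlpts=10` creates `__var110`, not `__var1_10`. The arrays then refer to new variables that are always missing, and because MIN and MAX ignore missing arguments, `wvar` simply equals `var`. The sketch below assumes that ending each prefix with an underscore is acceptable and that missing inputs should stay missing.

```sas
proc univariate data=have noprint;
   var var1 var2 var3;
   /* Prefixes end in "_" so the output names are __var1_10, __var1_90, ... */
   output out=_testing pctlpts=10 90 pctlpre=__var1_ __var2_ __var3_;
run;

data want;
   set have;
   if _n_ = 1 then set _testing;   /* read the percentiles once; they are retained */

   array wlo  {*} __var1_10 __var2_10 __var3_10;   /* lower cutoffs (10th pctl) */
   array whi  {*} __var1_90 __var2_90 __var3_90;   /* upper cutoffs (90th pctl) */
   array wval {*} wvar1 wvar2 wvar3;               /* winsorized values         */
   array val  {*} var1 var2 var3;                  /* original values           */

   do _V = 1 to dim(val);
      /* Leave missing values missing; otherwise clamp to [wlo, whi]. */
      if not missing(val{_V}) then
         wval{_V} = min(max(val{_V}, wlo{_V}), whi{_V});
   end;
   drop _V __var: ;   /* drop the loop index and the percentile variables */
run;
```

Note that this clamps at the 10th and 90th percentiles, which is not necessarily the same definition that PROC UNIVARIATE's WINSORIZED= option uses, so the resulting means can differ.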

Regular Contributor
Posts: 202

## Re: How to winsorize variables

> unless you ... you insist on using a different algorithm

I certainly do not. I am looking for the easiest solution to this problem. I've never used a macro in SAS however...

SAS Super FREQ
Posts: 3,839

## Re: How to winsorize variables

@FreelanceReinhard Thank you for acknowledging that a naive Winsorization, especially in the presence of missing values or repeated values, can lead to wrong answers. I have mentioned this in other threads, but it tends to be overlooked.

I believe that the correct way to Winsorize data is given in my article "How to Winsorize Data in SAS." When testing "manual" methods, be sure to use a data set that has missing and repeated values, such as Sashelp.Heart.
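
As a small illustration of the missing-value pitfall (a sketch, not taken from the article; the cutoffs 10 and 90 are made-up values), note that the MIN and MAX functions ignore missing arguments, so a naive clamp turns a missing value into the lower cutoff:

```sas
data _null_;
   x = .;                         /* a missing observation            */
   naive = min(max(x, 10), 90);   /* MAX ignores the missing argument */
   put naive=;                    /* log shows: naive=10              */
run;
```

A manual method therefore needs an explicit check for missing values before clamping.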

SAS Super FREQ
Posts: 3,839

## Re: How to winsorize variables

To the OP: Why are you wanting to Winsorize the data? What problem are you trying to solve?
