Programming the statistical procedures from SAS

How to winsorize variables

Regular Contributor
Posts: 196

How to winsorize variables

I am trying to use code I found in an old thread to winsorize variables. The code runs w/o error, but the means of the winsorized vars (wvar1, wvar2, wvar3) are no different from the means of the unwinsorized vars.

There are a few things I don't understand (the questions are also embedded in the code as comments).

  1. What is _n_? What role is it playing?
  2. Are these the winsorized variables (i.e. wvar1 wvar2 wvar3)?
    1. If so, the means are no different than the means of the variables prior to the "winsorization".
  3. How would I interpret this "min(max(val{_V},wlo{_V}),whi{_V});"? I'm confused by all these embedded arrays. I'm a relative beginner w/ SAS.
  4. Are these the "min" and "max" functions here? What is the purpose of including them here?

proc univariate data=have noprint;
   var var1 var2 var3;
   output out=_testing  pctlpts=10 90  pctlpre=__var1 __var2 __var3;
run;

data want;
  set have;
  if _n_=1 then set _testing;
  * What is _n_? What role is it playing here?;

  array wlo  {*} __ var1_10  __var2_10 __var2_10;
  array whi  {*} __ var1_90  __var2_90 __var2_90;
  array wval {*} wvar1  wvar2  wvar3;
  * Are these the variables that are supposed to contain the winsorized means?;
  array val  {*} var1 var2 var3;
  do _V=1 to dim(val);
    wval{_V}=min(max(val{_V},wlo{_V}),whi{_V});
    * Any help interpreting this?;
    * Are these the "min" and "max" functions here? What is the purpose of including them here?;
  end;
run;

Thank you very much for all your continued support. It is greatly appreciated. I would be lost w/o this forum. 

Trusted Advisor
Posts: 1,114

Re: How to winsorize variables

Hi @jcorroon,

In this brand new thread I've just pointed out that calculating Winsorized means "manually" (i.e., in a DATA step, possibly using a macro) can give different results from PROC UNIVARIATE's WINSORIZED= option. I think this option was introduced only in SAS 9 [EDIT: no, I was wrong, sorry, it was introduced in SAS version 7], so programs and macros for manual calculation might be obsolete, unless you need the Winsorized data (not only the means) or you insist on using a different algorithm.
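
As a rough illustration of the option mentioned above, a minimal sketch could look like the following. It assumes the data set `have` and the variables `var1`-`var3` from the question. The value 0.1 (the proportion of observations winsorized in each tail) is an assumption chosen only to mirror the 10th/90th percentiles in the original program, and the `ods select` line just restricts the output to the Winsorized means table.

```sas
/* Sketch only: Winsorized means directly from PROC UNIVARIATE.        */
/* Assumes data set HAVE with numeric variables var1-var3 (taken from  */
/* the question). 0.1 = proportion winsorized in each tail (assumed).  */
ods select WinsorizedMeans;
proc univariate data=have winsorized=0.1;
   var var1 var2 var3;
run;
```

This reports the Winsorized means only. It does not create winsorized copies of the variables, so a DATA step is still needed if the winsorized data themselves are required.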

 

I can take a closer look at your program tomorrow (CET) if you still want to use it.

Regular Contributor
Posts: 196

Re: How to winsorize variables

@FreelanceReinhard

<unless you...you insist on using a different algorithm>

I certainly do not. I am looking for the easiest solution to this problem. I've never used a macro in SAS, however...

Thanks for your help!

SAS Super FREQ
Posts: 3,309

Re: How to winsorize variables

@FreelanceReinhard Thank you for acknowledging that a naive Winsorization, especially in the presence of missing values or repeated values, can lead to wrong answers. I have mentioned this in other threads, but it tends to be overlooked.

I believe that the correct way to Winsorize data is given in my article "How to Winsorize Data in SAS." When testing "manual" methods, be sure to use a data set that has missing and repeated values, such as Sashelp.Heart.
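
To make the missing-value pitfall concrete, here is a minimal sketch of the manual percentile-clamping approach from the question. It is not the algorithm from the article above and will not necessarily match PROC UNIVARIATE's WINSORIZED= results. The data set name `_pctl` and the trailing underscores in the PCTLPRE= prefixes are assumptions made for this illustration. As far as I can tell, PROC UNIVARIATE builds each output name from the prefix followed by the percentile value, so the prefixes in the question (no trailing underscore) would likely produce names such as `__var110`. The arrays there would then point at new, all-missing variables, and because MIN and MAX skip missing arguments, each `wvar` would simply equal the original value.

```sas
/* Sketch only: assumes data set HAVE with numeric variables var1-var3. */
proc univariate data=have noprint;
   var var1 var2 var3;
   /* The trailing underscore in each prefix gives names such as        */
   /* __var1_10 and __var1_90 (prefix followed by the percentile).      */
   output out=_pctl pctlpts=10 90 pctlpre=__var1_ __var2_ __var3_;
run;

data want;
   set have;
   /* _N_ counts DATA step iterations. The one-row percentile data set  */
   /* is read once, on the first iteration; variables read by SET are   */
   /* retained, so the cutoffs stay available for every later row.      */
   if _n_ = 1 then set _pctl;

   array wlo  {*} __var1_10 __var2_10 __var3_10;   /* lower cutoffs    */
   array whi  {*} __var1_90 __var2_90 __var3_90;   /* upper cutoffs    */
   array val  {*} var1 var2 var3;                  /* original values  */
   array wval {*} wvar1 wvar2 wvar3;               /* clamped copies   */

   do _v = 1 to dim(val);
      /* MAX(., lo) returns lo, so without this guard a missing value   */
      /* would be silently replaced by the lower cutoff.                */
      if missing(val{_v}) then wval{_v} = .;
      /* MAX raises values below the lower cutoff up to it;             */
      /* MIN lowers values above the upper cutoff down to it.           */
      else wval{_v} = min(max(val{_v}, wlo{_v}), whi{_v});
   end;

   drop _v __var: ;   /* remove the loop index and the cutoff columns   */
run;
```

Running PROC MEANS on `want` with both `var1-var3` and `wvar1-wvar3` is a quick way to check whether the clamping had any effect.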

Grand Advisor
Posts: 9,457
Grand Advisor
Posts: 16,893

Re: How to winsorize variables

I highly recommend reading through the first link @Ksharp posted.

 

SAS Super FREQ
Posts: 3,309

Re: How to winsorize variables

To the OP: Why are you wanting to Winsorize the data? What problem are you trying to solve?
