I am trying to use code I found in an old thread to winsorize variables. The code runs without error, but the means of the winsorized variables (wvar1-wvar3) are no different from the means of the original, unwinsorized variables.
There are a few things I don't understand (my questions are also embedded in the syntax as comments).
proc univariate data=have noprint;
  var var1 var2 var3;
  output out=_testing pctlpts=10 90 pctlpre=__var1 __var2 __var3;
run;

data want;
  set have;
  if _n_=1 then set _testing; * What is _n_? What role is it playing here?;
  array wlo {*} __var1_10 __var2_10 __var3_10;
  array whi {*} __var1_90 __var2_90 __var3_90;
  array wval {*} wvar1 wvar2 wvar3; * Are these the variables that are supposed to contain the winsorized means?;
  array val {*} var1 var2 var3;
  do _V=1 to dim(val);
    wval{_V}=min(max(val{_V},wlo{_V}),whi{_V});
    * Any help interpreting this?;
    * Are these the "min" and "max" functions here? What is the purpose of including them here?;
  end;
run;
Thank you very much for all your continued support. It is greatly appreciated. I would be lost w/o this forum.
Hi @_maldini_,
In this brand new thread I've just pointed out that calculating Winsorized means "manually" (i.e. in a data step, possibly using a macro) can lead to different results than using PROC UNIVARIATE's WINSORIZED= option. I think this option was introduced only in SAS 9 [EDIT: no, I was wrong, sorry, it was introduced in SAS version 7], so that programs and macros for manual calculation might be obsolete, unless you need the Winsorized data (not only the means) or you insist on using a different algorithm.
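As a rough sketch of what that option looks like, assuming the data set `have` and the variables `var1`-`var3` from the question (the proportion 0.1 is my own choice, picked only to mirror the 10th/90th percentile cutoffs in the posted code):

```sas
/* Winsorized means with 10% of the observations Winsorized in each tail.
   A value between 0 and 0.5 is read as a proportion; an integer k would
   Winsorize the k smallest and k largest observations instead. */
proc univariate data=have winsorized=0.1;
  var var1 var2 var3;
  ods select WinsorizedMeans;  /* display only the Winsorized means table */
run;
```

This prints the Winsorized means only. It does not create a data set of Winsorized values, so it is not a drop-in replacement if you need `wvar1`-`wvar3` themselves.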
I can take a closer look at your program tomorrow (CET) if you still want to use it.
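In the meantime, one thing stands out in the posted step. With `pctlpre=__var1 __var2 __var3` and `pctlpts=10 90`, PROC UNIVARIATE appends the percentile value directly to each prefix, so the output variables are named `__var110`, `__var190`, and so on, not `__var1_10`. The arrays would then refer to variables that do not exist in `_testing`, which the DATA step creates as new, missing variables. Because the MIN and MAX functions ignore missing arguments, every `wvar` would simply equal the original value, which matches what you describe. A minimal sketch of a fix, assuming this is indeed the cause (I have not run it against your data), adds a trailing underscore to each prefix. The guard against missing values and the DROP statement are my additions, not part of the code you found:

```sas
/* 10th and 90th percentiles; the trailing underscore in each prefix
   yields names such as __var1_10 and __var1_90 */
proc univariate data=have noprint;
  var var1 var2 var3;
  output out=_testing pctlpts=10 90 pctlpre=__var1_ __var2_ __var3_;
run;

data want;
  set have;
  /* _n_ counts DATA step iterations. The one-row percentile data set is
     read only on the first iteration; variables read by SET are retained,
     so the cutoffs remain available for every row of HAVE. */
  if _n_=1 then set _testing;
  array wlo  {*} __var1_10 __var2_10 __var3_10;  /* lower cutoffs */
  array whi  {*} __var1_90 __var2_90 __var3_90;  /* upper cutoffs */
  array wval {*} wvar1 wvar2 wvar3;              /* Winsorized values, not means */
  array val  {*} var1 var2 var3;                 /* original values */
  do _V=1 to dim(val);
    /* MAX raises values below the lower cutoff to the cutoff, MIN lowers
       values above the upper cutoff. Missing values are skipped so they
       stay missing instead of being set to the lower cutoff. */
    if not missing(val{_V}) then
      wval{_V}=min(max(val{_V},wlo{_V}),whi{_V});
  end;
  drop _V __var: ;  /* remove the loop index and the cutoff variables */
run;

/* Compare the means of the original and the Winsorized variables */
proc means data=want n nmiss mean;
  var var1-var3 wvar1-wvar3;
run;
```

Note that this is still the "manual", percentile-based approach, so its means can differ from those reported by the WINSORIZED= option.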
<unless you...you insist on using a different algorithm>
I certainly do not. I am looking for the easiest solution to this problem. I've never used a macro in SAS however...
Thanks for your help!
@FreelanceReinh Thank you for acknowledging that a naive Winsorization, especially in the presence of missing values or repeated values, can lead to wrong answers. I have mentioned this in other threads, but it tends to be overlooked.
I believe that the correct way to Winsorize data is given in my article "How to Winsorize Data in SAS." When testing "manual" methods, be sure to use a data set that has missing and repeated values, such as Sashelp.Heart.
I highly recommend reading through the first link @Ksharp posted.
To the OP: Why are you wanting to Winsorize the data? What problem are you trying to solve?