Hi SAS Users,
Now I come to the winsorizing stage at the 1% level (replacing values above the 99th percentile with the 99th percentile value, and values below the 1st percentile with the 1st percentile value).
And when using the code customized by @mkeintz in 2012, it still works really well.
I have already tested it on the dataset sashelp.shoes for the variables sales and Inventory, and it works perfectly so far:
%let L=1; %* 1st percentile *;
%let H=%eval(100 - &L); %* 99th percentile*;

/* Compute the low and high percentile cutoffs for each variable */
proc univariate data=sashelp.shoes noprint;
  var sales Inventory;
  output out=_winsor pctlpts=&L &H pctlpre=__sales __Inventory ;
run;

data want (drop=__:);
  set sashelp.shoes;
  /* Read the one-row cutoff dataset once; its values are retained */
  if _n_=1 then set _winsor;
  array wlo {*} __sales&L __Inventory&L ;
  array whi {*} __sales&H __Inventory&H ;
  array wval {*} wsales wInventory ;
  array val {*} sales Inventory ;
  /* Clamp each value to the range [low cutoff, high cutoff] */
  do _V=1 to dim(val);
    wval{_V}=min(max(val{_V},wlo{_V}),whi{_V});
  end;
run;

/* Check the quantiles of the winsorized variables */
ods output Quantiles=outlier1;
proc univariate data=want;
  var wsales wInventory;
run;
The dataset sashelp.shoes has no missing values for the variables sales and Inventory, but that is not always the case in reality.
We all know that SAS treats a missing numeric value as smaller than any number, so I have the feeling that the code above will not work properly with a variable that has missing observations.
So I am wondering whether there is a way to adjust the code above to winsorize the variable sales (assuming it has missing observations) using only the nonmissing observations, leaving the missing ones alone.
Many thanks and warm regards.
The MIN() and MAX() functions ignore missing values, so you should test your assumption. Make up some sample data and see what happens when you have missing values.
FYI - if it is an issue, the fix is a single IF statement to add some conditional logic. Once you have a use case to test, you can easily test various IF conditions to get the one you need.
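A minimal sketch of such a test, assuming a made-up dataset `have` (the values 1 to 100 plus two missing rows) and hypothetical output names `w_unguarded` and `w_guarded`; none of these come from the original code. PROC UNIVARIATE leaves missing values out of the percentile calculation, so the cutoffs themselves are not affected. The thing to watch is the DATA step: because MAX() skips its missing argument, the unguarded expression returns the low cutoff for a missing input, while the guarded version leaves it missing.

```sas
/* Hypothetical test data: sales = 1..100 plus two missing values */
data have;
  do i = 1 to 100;
    sales = i;
    output;
  end;
  sales = .; output;
  sales = .; output;
  drop i;
run;

/* Cutoffs are computed from the nonmissing values only */
proc univariate data=have noprint;
  var sales;
  output out=_winsor pctlpts=1 99 pctlpre=__sales;
run;

data want (drop=__:);
  set have;
  if _n_ = 1 then set _winsor;

  /* Same expression as the original code: a missing sales value
     comes back as the 1st-percentile cutoff */
  w_unguarded = min(max(sales, __sales1), __sales99);

  /* Guarded version: winsorize only nonmissing values,
     so missing inputs stay missing */
  if not missing(sales) then
    w_guarded = min(max(sales, __sales1), __sales99);
run;

/* Inspect the rows where sales is missing */
proc print data=want;
  where missing(sales);
run;
```

If the guarded behaviour is the one you want, the same condition can go inside the array loop of the original program: `if not missing(val{_V}) then wval{_V}=min(max(val{_V},wlo{_V}),whi{_V});`.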