Hi SAS Users,
I am now at the winsorizing stage, at the 1% level (replace values above the 99th percentile with the 99th percentile value, and values below the 1st percentile with the 1st percentile value).
The code customized by @mkeintz in 2012 still works really well for this.
I have already tested it with the dataset sashelp.shoes for the variables sales and Inventory, and it works perfectly so far:
%let L=1;               %* 1st percentile *;
%let H=%eval(100 - &L); %* 99th percentile *;

/* Compute the low and high percentile cutoffs for each variable */
proc univariate data=sashelp.shoes noprint;
  var sales Inventory;
  output out=_winsor pctlpts=&L &H pctlpre=__sales __Inventory;
run;

/* Clamp each value to the range [low cutoff, high cutoff] */
data want (drop=__:);
  set sashelp.shoes;
  if _n_=1 then set _winsor;
  array wlo  {*} __sales&L __Inventory&L;
  array whi  {*} __sales&H __Inventory&H;
  array wval {*} wsales wInventory;
  array val  {*} sales Inventory;
  do _V=1 to dim(val);
    wval{_V}=min(max(val{_V},wlo{_V}),whi{_V});
  end;
run;

/* Check the quantiles of the winsorized variables */
ods output Quantiles=outlier1;
proc univariate data=want;
  var wsales wInventory;
run;
The dataset sashelp.shoes has no missing values for the variables sales and Inventory, but that is not always the case in reality.
We all know that a missing value is always treated as the smallest value, so I have the feeling that the code above will not work properly for a variable that has missing observations.
So I am wondering if there is any way to adjust the code above to winsorize the variable sales (assuming sales has missing observations) using only the non-missing numeric observations, leaving the missing ones alone.
Many thanks and warm regards.
The MIN() and MAX() functions ignore missing values, so you should test your assumption. Make up some sample data and see what happens when you have missing values.
FYI - if it is an issue, the fix is a single IF statement that adds some conditional logic. Once you have a use case to test, you can easily try various IF conditions to find the one you need.
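A minimal sketch of such a test, under some assumptions: the dataset name shoes_miss and the rule "blank out sales on every 10th row" are made up purely for illustration, and the NOT MISSING() guard is just one possible IF condition, not necessarily the one intended above. PROC UNIVARIATE leaves missing values out when it computes the percentiles, so the cutoffs themselves should be fine. The part worth checking is the assignment: because MAX() ignores missing arguments, max(., low) returns the low cutoff, so an unguarded assignment would turn a missing value into the 1st percentile value.

```sas
/* Hypothetical test data: sashelp.shoes with sales set to missing on every 10th row */
data shoes_miss;
  set sashelp.shoes;
  if mod(_n_, 10) = 0 then call missing(sales);
run;

%let L=1;
%let H=%eval(100 - &L);

/* Percentile cutoffs; missing values are not used in the calculation */
proc univariate data=shoes_miss noprint;
  var sales Inventory;
  output out=_winsor pctlpts=&L &H pctlpre=__sales __Inventory;
run;

data want (drop=__: _V);
  set shoes_miss;
  if _n_=1 then set _winsor;
  array wlo  {*} __sales&L __Inventory&L;
  array whi  {*} __sales&H __Inventory&H;
  array wval {*} wsales wInventory;
  array val  {*} sales Inventory;
  do _V=1 to dim(val);
    /* Guard (an assumption about the intended fix): winsorize only    */
    /* non-missing values; otherwise the new variable stays missing.   */
    if not missing(val{_V}) then
      wval{_V}=min(max(val{_V},wlo{_V}),whi{_V});
  end;
run;

/* NMISS of wsales should equal NMISS of sales if the guard works */
proc means data=want n nmiss min max;
  var sales wsales Inventory wInventory;
run;
```

To see the difference, run the DATA step once with the IF guard and once without it, then compare the NMISS column for sales and wsales in the PROC MEANS output.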