Here is a SAS macro that I created for outlier detection to fit my requirements. For more information, see: https://seleritysas.com/blog/2020/12/10/sas-custom-macros-that-make-feature-engineering-easy-for-data-scientists-data-engineers-and-machine-learning-specialists/

--------------------------------------- Macro Definition ----------------------------------------------------------

%macro outliers(dat, var);
  %local pval me sd Min_cutoff Max_cutoff Min Max;
  options nonotes;

  /* Test the variable for normality; the statistic and p-value go to work.ttest */
  proc univariate data=&dat normal noprint;
    var &var;
    output out=ttest normaltest=Test probn=P_Value;
  run;

  /* Copy the p-value into a macro variable. The macro %IF compares text,
     so it cannot read the data set column P_Value directly. */
  proc sql noprint;
    select P_Value format=best12. into :pval trimmed from ttest;
  quit;

  %if %sysevalf(&pval > 0.05) %then %do;
    options notes;
    %put NOTE: &var is normally distributed, so the STD method is used to find outliers.;
    %put NOTE: You can check the statistics and p-value in the work.ttest table.;
    options nonotes;

    proc sql noprint;
      select mean(&var), std(&var) into :me trimmed, :sd trimmed from &dat;
    quit;

    /* Cutoffs at the mean plus or minus 3 standard deviations */
    %let Min_cutoff = %sysevalf(&me - (3 * &sd));
    %let Max_cutoff = %sysevalf(&me + (3 * &sd));

    /* Both branches now write the flagged rows to work.outliers */
    data outliers;
      set &dat;
      where &var < &Min_cutoff or &var > &Max_cutoff;
    run;
  %end;
  %else %do;
    options notes;
    %put NOTE: &var is not normally distributed, so the percentile method is used to find outliers.;
    %put NOTE: You can check the statistics and p-value in the work.ttest table and the percentile values in the work.ranges table.;
    options nonotes;

    /* 1st and 99th percentiles go to work.ranges */
    proc means data=&dat stackods n qrange p1 p99;
      var &var;
      ods output summary=ranges;
    run;

    proc sql noprint;
      select P1, P99 into :Min trimmed, :Max trimmed from ranges;
    quit;

    data outliers;
      set &dat;
      where &var < &Min or &var > &Max;
    run;
  %end;

  options notes;
%mend outliers;

------------------------------------------- Macro Testing ---------------------------------------------------------

options nomprint nomlogic nosymbolgen;
%outliers(Lib.dataset_name, Variable_Name)
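
As a rough illustration of a test run, the sketch below calls the macro on a concrete table. It assumes the macro has already been compiled in the current session and that the SASHELP sample library is installed. The table sashelp.heart and its Cholesterol column are chosen here only as example data and are not part of the original post.

```sas
/* Example call on a SASHELP sample table (an assumption, not from the post) */
%outliers(sashelp.heart, Cholesterol)

/* Normality test statistic and p-value saved by the macro */
proc print data=work.ttest noobs;
run;

/* List the tables in WORK to see which output tables the run created */
proc datasets library=work;
quit;
```

Looking at work.ttest first shows which branch the macro should have taken. The WORK listing then confirms whether the percentile table work.ranges was created alongside the table of flagged rows.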