## Counting Outliers

Hi All,

I'm new to SAS, and my task is to calculate the share of outliers in my data. I have prepared the following code so far, but I got stuck after joining the tables.

Creating descriptive statistics
title 'Descriptive statistics for each variable from "valid" dataset';
ods exclude _all_;
proc means data=valid
n min p1 p5 p10 q1 p50 q3 p90 p95 p99 max mean std qrange range
STACKODSOUTPUT;
var _numeric_ ;
ods output summary=summary_valid;
run;
ods exclude none;

Creating a table with outlier bounds data
proc sql;
create table outlier_valid as
select variable, Q1-1.5*QRANGE as lb_mid, Q3+1.5*QRANGE as ub_mid, Q1-3*QRANGE as lb_extr, Q3+3*QRANGE as ub_extr
from summary_valid
order by variable desc;
quit;

proc print data=outlier_valid noobs;
title1 'Outlier bounds for the "valid" dataset';
title2 'lb_mid = lower bound for mild outliers';
title3 'ub_mid = upper bound for mild outliers';
title4 'lb_extr = lower bound for extreme outliers';
title5 'ub_extr = upper bound for extreme outliers';
run;

Joining the two tables above
proc sql;
create table valid_join as
select s.variable, s.n, s.MIN, s.P1, s.P5, s.P10, s.Q1, s.P50, s.Q3, s.P90, s.P95, s.P99, s.MAX, s.MEAN, s.STDDEV, s.QRANGE, s.RANGE, o.lb_mid, o.ub_mid, o.lb_extr, o.ub_extr
from summary_valid s, outlier_valid o
where s.variable = o.variable;
quit;

Now I want to create variables holding the counts of observations which:
are lower than lb_mid
are higher than ub_mid
are lower than lb_extr
are higher than ub_extr

I would also like to create the flag variables mild_outliers (true/false) and extr_outliers (true/false).

Could you please give me some tips?
Kind regards,
Cezary

## Re: Counting Outliers

Hi,

You can also use another handy procedure to identify outliers: PROC UNIVARIATE. It can save the most extreme observations in a data set, along with the statistics needed to decide which of them are outliers. Here is an example and a link to the documentation: https://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.4&docsetId=procstat&docsetTarget=pro...

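```
/* Save the ExtremeObs ODS table (lowest and highest observations per variable) */
ods output ExtremeObs=outliers;
/* OUTTABLE= stores the summary statistics, one row per analysis variable */
proc univariate data=sashelp.class outtable=work.all;
var Height age;
run;
```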
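
If you would rather keep the 1.5*IQR and 3*IQR bounds from your own code, here is a minimal sketch of how the counts, shares and true/false flags could be computed. It runs on sashelp.class so that it is self-contained. The macro variable `ds`, the table names (`work.stats`, `work.long`, `work.outlier_counts`) and the output column names are my own placeholders, not anything from your program, so adjust them as needed. Missing values are skipped on purpose, because SAS treats a missing value as smaller than any number and it would otherwise be counted as a low outlier.

```sas
%let ds = sashelp.class;   /* assumption: swap in your own table, e.g. valid */

/* 1. Q1, Q3 and IQR for every numeric variable, one row per variable */
ods exclude all;
proc means data=&ds q1 q3 qrange stackodsoutput;
  var _numeric_;
  ods output summary=work.stats;
run;
ods exclude none;

/* 2. Long format: one row per non-missing value of each numeric variable */
data work.long;
  length variable $32;
  set &ds;
  array nums {*} _numeric_;
  do _i = 1 to dim(nums);
    if not missing(nums{_i}) then do;
      variable = vname(nums{_i});
      value = nums{_i};
      output;
    end;
  end;
  keep variable value;
run;

/* 3. Join values to their bounds; a comparison yields 1 or 0, so SUM counts hits */
proc sql;
  create table work.outlier_counts as
  select s.variable,
         count(l.value)                     as n_nonmiss,
         sum(l.value < s.q1 - 1.5*s.qrange) as n_below_lb_mid,
         sum(l.value > s.q3 + 1.5*s.qrange) as n_above_ub_mid,
         sum(l.value < s.q1 - 3*s.qrange)   as n_below_lb_extr,
         sum(l.value > s.q3 + 3*s.qrange)   as n_above_ub_extr
  from work.stats as s
       inner join work.long as l
       on upcase(s.variable) = upcase(l.variable)
  group by s.variable;
quit;

/* 4. Flags (1 = true, 0 = false) and shares per variable */
data work.outlier_counts;
  set work.outlier_counts;
  /* note: values beyond the extreme bounds are also beyond the mild bounds */
  mild_outliers = (n_below_lb_mid + n_above_ub_mid > 0);
  extr_outliers = (n_below_lb_extr + n_above_ub_extr > 0);
  share_mild = (n_below_lb_mid + n_above_ub_mid) / n_nonmiss;
  share_extr = (n_below_lb_extr + n_above_ub_extr) / n_nonmiss;
run;

proc print data=work.outlier_counts noobs;
run;
```

The long format is the main design choice here: with one row per value, a single join and GROUP BY covers every numeric variable, so you do not need a macro loop over variable names. If you want "mild" to mean only the values between the 1.5*IQR and 3*IQR bounds, subtract the extreme counts from the mild counts in step 4.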