Calcite | Level 5
Hi All,
I'm new to SAS and my task is to calculate the share of outliers in my data. I have already prepared the following code but got stuck after joining the tables.
Creating descriptive statistics
title 'Descriptive statistics for each variable from "valid" dataset';
ods exclude all;
/* STACKODSOUTPUT writes one row per variable, with a "Variable" column */
proc means data=valid stackodsoutput
           n min p1 p5 p10 q1 p50 q3 p90 p95 p99 max mean std qrange range;
var _numeric_;
ods output summary=summary_valid;
run;
ods exclude none;
Creating a table with outlier bounds data
proc sql;
create table outlier_valid as
select variable, Q1-1.5*QRANGE as lb_mid, Q3+1.5*QRANGE as ub_mid, Q1-3*QRANGE as lb_extr, Q3+3*QRANGE as ub_extr
from summary_valid
order by variable desc;
quit;
proc print data=outlier_valid noobs;
  title1 'The summary of outliers data valid';
  title2 'lb_mid = lower bound for mild outliers';
  title3 'ub_mid = upper bound for mild outliers';
  title4 'lb_extr = lower bound for extreme outliers';
  title5 'ub_extr = upper bound for extreme outliers';
run;
Joining the two tables above
proc sql;
create table valid_join as
select s.variable, s.n, s.MIN, s.P1, s.P5, s.P10, s.Q1, s.P50, s.Q3, s.P90, s.P95, s.P99, s.MAX, s.MEAN, s.STDDEV, s.QRANGE, s.RANGE, o.lb_mid, o.ub_mid, o.lb_extr, o.ub_extr
from summary_valid s, outlier_valid o
where s.variable = o.variable;
quit;
Now I want to create variables holding, for each variable, the count of values which:
are lower than lb_mid
are higher than ub_mid
are lower than lb_extr
are higher than ub_extr
I would also like to create the variables mild_outliers (true/false) and extr_outliers (true/false).
Could you please give me some tips?
Kind regards,
SAS Employee



Another procedure you can use to identify outliers is PROC UNIVARIATE. It can save the extreme observations to a data set (the ExtremeObs ODS table) and also the statistics needed to determine the outliers (the OUTTABLE= option). Here is an example; see the PROC UNIVARIATE documentation for details.


ods output ExtremeObs=outliers;
proc univariate data=sashelp.class outtable=work.all;
	var Height age;
run;
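
To get the counts asked about in the question, one possible approach is to reshape the data so that each value sits next to its variable name, then join it to the bounds and sum the comparisons. The following is only a sketch under some assumptions: it uses sashelp.class as stand-in data, the table names work.stats, work.long, work.bounds and work.outlier_counts and the count column names are made up for illustration, and the flags are assumed to mean "at least one value lies outside the corresponding bounds". Note that the mild counts include the extreme ones, because the extreme bounds lie outside the mild bounds.

```sas
/* Step 1: quartiles and IQR, one row per variable (STACKODSOUTPUT) */
ods exclude all;
proc means data=sashelp.class stackodsoutput n q1 q3 qrange;
   var _numeric_;
   ods output summary=work.stats;
run;
ods exclude none;

/* Step 2: long format, one row per observation and numeric variable */
data work.long;
   set sashelp.class;
   length variable $32;
   array nums {*} _numeric_;      /* numeric variables read from the input */
   do i = 1 to dim(nums);
      variable = vname(nums{i});  /* name of the current variable */
      value    = nums{i};
      output;
   end;
   keep variable value;
run;

proc sql;
   /* Step 3: outlier bounds per variable, as in the question */
   create table work.bounds as
   select variable,
          q1 - 1.5*qrange as lb_mid,
          q3 + 1.5*qrange as ub_mid,
          q1 - 3*qrange   as lb_extr,
          q3 + 3*qrange   as ub_extr
   from work.stats;

   /* Step 4: a comparison evaluates to 1 or 0, so SUM() counts the hits */
   create table work.outlier_counts as
   select b.variable, b.lb_mid, b.ub_mid, b.lb_extr, b.ub_extr,
          sum(l.value < b.lb_mid)  as n_below_lb_mid,
          sum(l.value > b.ub_mid)  as n_above_ub_mid,
          sum(l.value < b.lb_extr) as n_below_lb_extr,
          sum(l.value > b.ub_extr) as n_above_ub_extr,
          (sum(l.value < b.lb_mid  or l.value > b.ub_mid)  > 0) as mild_outliers,
          (sum(l.value < b.lb_extr or l.value > b.ub_extr) > 0) as extr_outliers
   from work.bounds as b
        inner join work.long as l
        on upcase(b.variable) = upcase(l.variable)
   where l.value is not missing   /* a missing value compares lower than any number */
   group by b.variable, b.lb_mid, b.ub_mid, b.lb_extr, b.ub_extr;
quit;
```

The flags come out as 1/0 rather than true/false, since SAS has no Boolean type. Dividing a count by the number of nonmissing values, for example by adding count(l.value) to the select list, would give the share of outliers per variable.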

I hope this helps.


