cezaryrosa
Calcite | Level 5
Hi All,
 
I'm new to SAS, and my task is to calculate the share of outliers in my data. I have prepared the following code but got stuck after joining the tables.
 
Creating descriptive statistics
title 'Descriptive statistics for each variable from "valid" dataset';
ods exclude _all_;
proc means data=valid
           n min p1 p5 p10 q1 p50 q3 p90 p95 p99 max mean std qrange range
           STACKODSOUTPUT;  
var _numeric_ ;
ods output summary=summary_valid;
run;
ods exclude none;
 
Creating a table with outlier bounds data
proc sql;
create table outlier_valid as
select variable, Q1-1.5*QRANGE as lb_mid, Q3+1.5*QRANGE as ub_mid, Q1-3*QRANGE as lb_extr, Q3+3*QRANGE as ub_extr 
from summary_valid
order by variable desc;
quit;
 
proc print data=outlier_valid noobs;
  title1 'Summary of outlier bounds for the "valid" dataset';
  title2 'lb_mid = lower bound for mild outliers';
  title3 'ub_mid = upper bound for mild outliers';
  title4 'lb_extr = lower bound for extreme outliers';
  title5 'ub_extr = upper bound for extreme outliers';
run;
 
 
Joining the two tables above
proc sql;
create table valid_join as
select s.variable, s.n, s.MIN, s.P1, s.P5, s.P10, s.Q1, s.P50, s.Q3,
       s.P90, s.P95, s.P99, s.MAX, s.MEAN, s.STDDEV, s.QRANGE, s.RANGE,
       o.lb_mid, o.ub_mid, o.lb_extr, o.ub_extr
from summary_valid s, outlier_valid o
where s.variable = o.variable;
quit;
 
Now I want to create variables that count the values which:
are lower than lb_mid
are higher than ub_mid
are lower than lb_extr
are higher than ub_extr
 
I would also like to create the variables mild_outliers (true/false) and extr_outliers (true/false).
 
Could you please give me some tips?
Kind regards,
Cezary
 
1 REPLY
joseenrique1
SAS Employee

Hi,

You can use another handy procedure to identify outliers: PROC UNIVARIATE. It can save the extreme observations to a data set, together with the statistics used to determine the outliers. Below is an example, along with links to the documentation:
https://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.4&docsetId=procstat&docsetTarget=pro...

https://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.4&docsetId=procstat&docsetTarget=pro...

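/* Capture the Extreme Observations table (by default the five lowest and five highest values of each variable) */
ods output ExtremeObs=outliers;
/* OUTTABLE= saves the descriptive statistics for each analysis variable in work.all */
proc univariate data=sashelp.class outtable=work.all;
    var Height age;
run;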
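
If you would rather keep the fences you already computed, here is a minimal sketch of one way to get the counts and flags you asked about. It assumes the outlier_valid table from your question exists with the columns variable, lb_mid, ub_mid, lb_extr and ub_extr, and that valid has no columns named variable, value or _i. The names valid_long, outlier_counts, n_below_lb_mid and so on are only illustrative. SAS has no true/false type, so the flags come out as 1/0.

```sas
/* Reshape "valid" to long form: one row per non-missing (variable, value) pair. */
/* Missing values are skipped because SAS treats a missing numeric as smaller    */
/* than any number, so it would otherwise be counted as a low outlier.           */
data valid_long;
    set valid;
    length variable $32;
    array nums {*} _numeric_;   /* all numeric columns read from "valid" */
    do _i = 1 to dim(nums);
        if not missing(nums{_i}) then do;
            variable = vname(nums{_i});
            value    = nums{_i};
            output;
        end;
    end;
    keep variable value;
run;

/* Count the values outside each fence, per variable.                      */
/* In PROC SQL a comparison evaluates to 1 (true) or 0 (false), so SUM     */
/* counts the matches, MAX flags "at least one", and MEAN gives the share. */
proc sql;
    create table outlier_counts as
    select o.variable,
           count(l.value)                                      as n_nonmissing,
           sum(l.value < o.lb_mid)                             as n_below_lb_mid,
           sum(l.value > o.ub_mid)                             as n_above_ub_mid,
           sum(l.value < o.lb_extr)                            as n_below_lb_extr,
           sum(l.value > o.ub_extr)                            as n_above_ub_extr,
           max(l.value < o.lb_mid  or l.value > o.ub_mid)      as mild_outliers,
           max(l.value < o.lb_extr or l.value > o.ub_extr)     as extr_outliers,
           mean(l.value < o.lb_mid or l.value > o.ub_mid)      as share_outside_mid
    from outlier_valid as o
         inner join valid_long as l
         on upcase(o.variable) = upcase(l.variable)
    group by o.variable;
quit;
```

Note that a value beyond an extreme fence is also beyond the corresponding mild fence, so it is counted in both sets of columns. If you want mild outliers to exclude the extreme ones, subtract the extreme counts from the mild counts.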

Hope this helps.
