cezaryrosa
Hi All,
 
I'm new to SAS, and my task is to calculate the share of outliers in my data. I have prepared the following code but got stuck after joining the tables.
 
Creating descriptive statistics
title 'Descriptive statistics for each variable from "valid" dataset';
ods exclude _all_;
proc means data=valid
           n min p1 p5 p10 q1 p50 q3 p90 p95 p99 max mean std qrange range
           STACKODSOUTPUT;  
var _numeric_ ;
ods output summary=summary_valid;
run;
ods exclude none;
 
Creating a table with outlier bounds data
proc sql;
create table outlier_valid as
select variable, Q1-1.5*QRANGE as lb_mid, Q3+1.5*QRANGE as ub_mid, Q1-3*QRANGE as lb_extr, Q3+3*QRANGE as ub_extr 
from summary_valid
order by variable desc;
quit;
 
proc print data=outlier_valid noobs;
  title1 'Outlier bounds for the "valid" dataset';
  title2 'lb_mid = lower bound for mild outliers';
  title3 'ub_mid = upper bound for mild outliers';
  title4 'lb_extr = lower bound for extreme outliers';
  title5 'ub_extr = upper bound for extreme outliers';
run;
 
 
Joining the two tables above
proc sql;
create table valid_join as
select s.variable, s.N, s.MIN, s.P1, s.P5, s.P10, s.Q1, s.P50, s.Q3, s.P90, s.P95, s.P99, s.MAX, s.MEAN, s.STDDEV, s.QRANGE, s.RANGE, o.lb_mid, o.ub_mid, o.lb_extr, o.ub_extr
from summary_valid s, outlier_valid o
where s.variable = o.variable;
quit;
 
Now I want to create variables that count the observations which:
are lower than lb_mid
are higher than ub_mid
are lower than lb_extr
are higher than ub_extr
 
I would also like to create the variables mild_outliers (true/false) and extr_outliers (true/false).
 
Could you please give me some tips?
Kind regards,
Cezary
 
joseenrique1
SAS Employee

Hi, 

 

You can use another handy procedure to identify outliers: PROC UNIVARIATE. It can save the extreme observations to a data set, along with the statistics used to determine the outliers. Here is an example, with links to the documentation. https://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.4&docsetId=procstat&docsetTarget=pro... 

 

https://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.4&docsetId=procstat&docsetTarget=pro... 

 


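/* Save the Extreme Observations table (ODS name: ExtremeObs) to a data set */
ods output ExtremeObs=outliers;
/* OUTTABLE= writes one row of summary statistics per analysis variable */
proc univariate data=sashelp.class outtable=work.all;
	var Height Age;
run;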
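
For the counts asked about in the question, one possible approach is sketched below. It has not been run against the real data and rests on a few assumptions: the data set VALID and the bounds table OUTLIER_VALID exist as built in the question, VALID has no columns already named variable or value, and the names VALID_LONG, OUTLIER_COUNTS and all of the new column names are made up for illustration. The idea is to reshape the data to one row per variable and value, join it to the bounds, and sum the comparisons.

```sas
/* Sketch only: VALID and OUTLIER_VALID come from the question above;
   VALID_LONG, OUTLIER_COUNTS and the new column names are hypothetical. */

/* Step 1: reshape VALID to long form, one row per (variable, value).
   The ARRAY statement comes right after SET so that _NUMERIC_ covers only
   the columns read from VALID, not the helper variables created later. */
data valid_long;
    set valid;
    array nums {*} _numeric_;
    length variable $32;
    do _i = 1 to dim(nums);
        variable = vname(nums{_i});   /* name of the current column */
        value    = nums{_i};          /* its value on this row      */
        output;
    end;
    keep variable value;
run;

/* Step 2: join each value to the bounds of its variable and count.
   A comparison evaluates to 1 (true) or 0 (false), so SUM counts the
   outliers, MAX gives a 1/0 flag, and AVG gives the share. */
proc sql;
    create table outlier_counts as
    select o.variable,
           count(l.value)                                  as n_obs,
           sum(l.value < o.lb_mid)                         as n_below_lb_mid,
           sum(l.value > o.ub_mid)                         as n_above_ub_mid,
           sum(l.value < o.lb_extr)                        as n_below_lb_extr,
           sum(l.value > o.ub_extr)                        as n_above_ub_extr,
           max(l.value < o.lb_mid  or l.value > o.ub_mid)  as mild_outliers,
           max(l.value < o.lb_extr or l.value > o.ub_extr) as extr_outliers,
           avg(l.value < o.lb_mid  or l.value > o.ub_mid)  as share_mild
    from outlier_valid as o
         inner join valid_long as l
         on upcase(o.variable) = upcase(l.variable)
    /* Missing values sort below every number in SAS, so drop them;
       otherwise they would be counted as lower-bound outliers. */
    where l.value is not missing
    group by o.variable;
quit;
```

Because the extreme bounds lie outside the mild ones, the mild counts here also include the extreme outliers; subtract the extreme counts if you need the mild-only figures. SAS has no Boolean type, so the two flags are stored as 1 (true) and 0 (false).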

Hope that this can help you.
