cezaryrosa
Calcite | Level 5
Hi All,
 
I'm new to SAS and my task is to calculate the share of outliers in my data. I have already prepared the following code, but I got stuck after joining the tables.
 
Creating descriptive statistics
title 'Descriptive statistics for each variable from "valid" dataset';
ods exclude _all_;
proc means data=valid
           n min p1 p5 p10 q1 p50 q3 p90 p95 p99 max mean std qrange range
           STACKODSOUTPUT;  
var _numeric_ ;
ods output summary=summary_valid;
run;
ods exclude none;
 
Creating a table with outlier bounds data
proc sql;
create table outlier_valid as
select variable, Q1-1.5*QRANGE as lb_mid, Q3+1.5*QRANGE as ub_mid, Q1-3*QRANGE as lb_extr, Q3+3*QRANGE as ub_extr 
from summary_valid
order by variable desc;
quit;
 
proc print data=outlier_valid noobs;
 title1 'The summary of outliers data valid';
  title2 'lb_mid = lower bound for mid outliers';
  title3 'ub_mid = upper bound for mid outliers';
  title4 'lb_extr = lower bound for extreme outliers';
  title5 'ub_extr = upper bound for extreme outliers';
run;
 
 
Joining the two tables above
proc sql;
create table train_join as 
select s.variable, s.n, s.MIN, s.P1, s.P5, s.P10, s.Q1, s.P50, S.Q3, s.P90, s.P95, s.P99, s.MAX, s.MEAN, s.STDDEV, s.QRANGE, s.RANGE, o.lb_mid, o.ub_mid, o.lb_extr, o.ub_extr   
from summary_train s, outlier_train o
where s.variable = o.variable;
quit;
 
Now I want to create variables holding, for each variable, the count of observations which:
are lower than lb_mid
are higher than ub_mid
are lower than lb_extr
are higher than ub_extr
 
I would also like to create the variables mild_outliers (true/false) and extr_outliers (true/false).
 
Could you please give me some tips?
Kind regards,
Cezary
 
1 REPLY
joseenrique1
SAS Employee

Hi, 

 

You can use another handy procedure to identify outliers: PROC UNIVARIATE. Its ExtremeObs ODS table holds the lowest and highest observations for each variable, and the OUTTABLE= option saves the summary statistics (including the quartiles) you need to determine the outlier bounds. Here is an example, with links to the documentation. https://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.4&docsetId=procstat&docsetTarget=pro... 

 

https://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.4&docsetId=procstat&docsetTarget=pro... 

 


ods output ExtremeObs=outliers;
proc univariate data=sashelp.class outtable=work.all ;
	var Height age;
run;
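
If you would rather keep the bounds you already computed, here is a minimal sketch of one way to get the counts and flags you asked about. It assumes that valid and outlier_valid exist as in your post, and that valid has no variables named variable, value or i. The names valid_long, outlier_counts, value, the n_* columns and share_mild are only placeholders I made up for illustration. Note that the mild counts also include the extreme observations, because anything below lb_extr is also below lb_mid.

```sas
/* Reshape valid to one row per observation and numeric variable. */
/* Assumption: valid has no variables named variable, value or i. */
data valid_long;
    set valid;
    array nums {*} _numeric_;      /* numeric variables read from valid */
    length variable $32;
    do i = 1 to dim(nums);
        variable = vname(nums{i}); /* name of the current variable */
        value    = nums{i};
        output;
    end;
    keep variable value;
run;

/* Compare each value with its bounds and summarise per variable. */
/* A comparison returns 1 (true) or 0 (false), so SUM counts rows */
/* and MEAN gives the share among the non-missing values.         */
proc sql;
    create table outlier_counts as
    select o.variable,
           sum(l.value < o.lb_mid)  as n_below_lb_mid,
           sum(l.value > o.ub_mid)  as n_above_ub_mid,
           sum(l.value < o.lb_extr) as n_below_lb_extr,
           sum(l.value > o.ub_extr) as n_above_ub_extr,
           mean(l.value < o.lb_mid or l.value > o.ub_mid) as share_mild,
           (sum(l.value < o.lb_mid  or l.value > o.ub_mid)  > 0) as mild_outliers,
           (sum(l.value < o.lb_extr or l.value > o.ub_extr) > 0) as extr_outliers
    from valid_long as l
         inner join outlier_valid as o
         on upcase(l.variable) = upcase(o.variable)
    /* Missing values sort below every number, so drop them first. */
    where l.value is not missing
    group by o.variable;
quit;
```

The flags mild_outliers and extr_outliers come out as 1/0, which is how SAS represents true/false. You could join outlier_counts back to your summary table on variable, in the same way as in your last PROC SQL step.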

Hope that this can help you.
