BookmarkSubscribeRSS Feed
Riya88
Fluorite | Level 6

Hi SAS Experts -

I have a macro that calculates acceptable range of a variable. An acceptable range is defined by :

Lower Limit = Q1 - 1.5*(Q3-Q1)

Upper Limit = Q3 + 1.5*(Q3-Q1)

It's a boxplot method of calculating outliers. The macro is working fine. But it is inefficient in terms of its processing as it calculates outliers for each variable in a loop and then capping values. I want proc univariate to be run for all the variables (not in loop) and save output in a dataset and then capping for variables using IF THEN at one time only.

Code : -

options mprint symbolgen;

%macro outliers(input=, vars=, output= );

data &output;

set &input;

run;

%let n=%sysfunc(countw(&vars));

%do i= 1 %to &n;

%let val = %scan(&vars,&i);

/* Calculate the quartiles and inter-quartile range using proc univariate */

proc univariate data=&output noprint;

var &val;

output out=temp QRANGE= IQR Q1= First_Qtl Q3= Third_Qtl;

run;

/* Extract the upper and lower limits into macro variables */

data _null_;

set temp;

call symput('QR', IQR);

call symput('Q1', First_Qtl);

call symput('Q3', Third_Qtl);

run;

%let ULimit=%sysevalf(&Q3 + 1.5 * &QR);

%let LLimit=%sysevalf(&Q1 - 1.5 * &QR);

/* Final dataset excluding outliers*/

data &output;

set &output;

if &val < &Llimit then &val = &Llimit;

if &val > &Ulimit then &val = &Ulimit;

run;

%end;

%mend;

%outliers(Input=abcd, Vars = a, output= test);

o

Thanks in anticipation!

1 REPLY 1
Daniel-Santos
Obsidian | Level 7

Hi.

 

I guest you are looking for something like this:

 

%macro outliers(input=, vars=, output= );

%let nvars=%sysfunc(countw(&vars)); %do i= 1 %to &nvars; %let val = %scan(&vars,&i);
/* Calculate the quartiles and inter-quartile range using proc univariate */ proc univariate data=&input noprint; var &vars; * get qr, q1, q3 for each vars; output out=&output QRANGE= %do j=1 %to &nvars; IQR_&j %end; Q1= %do j=1 %to &nvars; First_Qtl_&j %end; Q3= %do j=1 %to &nvars; Third_Qtl_&j %end; ; run;
/* Extract the upper and lower limits into macro variables */ data _null_; set &output;
* get lower/upper limits for each vars; %do j=1 %to &nvars; call symput("Llimit_&j",put(First_Qtl_&j+1.5*IQR_&j,best.)); call symput("Ulimit_&j",put(Third_Qtl_&j+1.5*IQR_&j,best.)); %end; run;
/* Final dataset excluding outliers*/ data &output; set &input; * adjust limits for each vars; %do j=1 %to &nvars; %let var=%scan(&vars,&j,%str( )); %put Var=&var Llimit=&&Llimit_&j Ulimit=&&Ulimit_&j; if &var < &&Llimit_&j then &var = &&Llimit_&j; if &var > &&Ulimit_&j then &var = &&Ulimit_&j; %end; run; %end;

* get rid of temporaries;
proc datasets lib=work nolist;
delete _:;
quit; %mend; %outliers(Input=abcd, Vars = a b, output= test);

Hope it helps.

 

Daniel Santos @ www.cgd.pt

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 1 reply
  • 758 views
  • 0 likes
  • 2 in conversation