Hi SAS Experts -
I have a macro that calculates the acceptable range of a variable. The acceptable range is defined by:
Lower Limit = Q1 - 1.5*(Q3-Q1)
Upper Limit = Q3 + 1.5*(Q3-Q1)
This is the boxplot method of identifying outliers. The macro works fine, but it is inefficient: for each variable in the loop it runs PROC UNIVARIATE and then caps the values in a separate data step. I want PROC UNIVARIATE to run once for all the variables (not in a loop), save the output in a dataset, and then cap all the variables with IF-THEN statements in a single step.
Code:
options mprint symbolgen;
%macro outliers(input=, vars=, output= );
   data &output;
      set &input;
   run;
   %let n=%sysfunc(countw(&vars));
   %do i= 1 %to &n;
      %let val = %scan(&vars,&i);
      /* Calculate the quartiles and inter-quartile range using proc univariate */
      proc univariate data=&output noprint;
         var &val;
         output out=temp QRANGE= IQR Q1= First_Qtl Q3= Third_Qtl;
      run;
      /* Extract the upper and lower limits into macro variables */
      data _null_;
         set temp;
         call symput('QR', IQR);
         call symput('Q1', First_Qtl);
         call symput('Q3', Third_Qtl);
      run;
      %let ULimit=%sysevalf(&Q3 + 1.5 * &QR);
      %let LLimit=%sysevalf(&Q1 - 1.5 * &QR);
      /* Final dataset excluding outliers*/
      data &output;
         set &output;
         if &val < &Llimit then &val = &Llimit;
         if &val > &Ulimit then &val = &Ulimit;
      run;
   %end;
%mend;
%outliers(Input=abcd, Vars = a, output= test);
Thanks in anticipation!
If we are using univariate we could calculate the statistics of only one variable at a time. So the method you are following is correct.
SAS procs are designed to handle many variables in a single pass over the data.
Even the output dataset of PROC UNIVARIATE supports this; see Base SAS(R) 9.4 Procedures Guide: Statistical Procedures, Second Edition. It ends by mentioning two variables being processed, each getting different suffixes.
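As a rough sketch of what that could look like: the OUTPUT statement accepts one name per VAR variable after each statistic keyword, in VAR order. The dataset (sashelp.class), the variables, and the output names such as age_q1 are only illustrative assumptions, not taken from the original macro.

```sas
/* Sketch: one PROC UNIVARIATE pass over several variables.        */
/* The names after each keyword are illustrative; list one name    */
/* per VAR variable, in the same order as the VAR statement.       */
proc univariate data=sashelp.class noprint;
   var age height weight;
   output out=stats
      q1     = age_q1  height_q1  weight_q1
      q3     = age_q3  height_q3  weight_q3
      qrange = age_iqr height_iqr weight_iqr;
run;
```

The resulting stats dataset should hold a single row with nine columns, which can then be combined with the original data in one later step.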
You start your question with "I have a macro". There could be some confusion there.
Macros in SAS are source-text based (type 2), not functional processes (type 3) or recorded keyboard/mouse actions (type 1). If you are using the word "macro" the way it is used in Excel or another programming language, leave that behind and program your logic by understanding SAS from scratch: http://en.wikipedia.org/wiki/Macro_(computer_science)
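To illustrate the "source text" point, here is a minimal sketch. The macro name %show_mean is hypothetical; all it does is generate the text of an ordinary PROC MEANS step, which SAS then compiles and runs like any hand-written code.

```sas
/* Sketch: a SAS macro only generates source text.                  */
/* %show_mean is a hypothetical name used for illustration.         */
options mprint;                /* write the generated code to the log */

%macro show_mean(var);
   proc means data=sashelp.class mean;
      var &var;
   run;
%mend show_mean;

/* Expands to: proc means data=sashelp.class mean; var age; run;    */
%show_mean(age)
```

With MPRINT on, the log shows the generated PROC MEANS statements, which makes it clear that the macro itself computes nothing.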
One more question, since you are trying to build boxplots (http://nesug.org/proceedings/nesug08/np/np16.pdf):
did you check whether your problem is easily solved by one of the many procs (the modern word would be packages) that are around?
For part 1, switch to PROC MEANS. I don't think you need a macro for this at all, but perhaps Jaap's paper illustrates it better than I can here.
Here are two different ways of generating the stats for all variables at once:
proc means data=sashelp.class stackods n p25 p75 qrange median;
   var age weight height;
   ods output summary=want1;
   output out=want2 n= p25= p75= qrange= median=/autoname;
run;
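For part 2 (capping everything in one step), something along these lines might work. It is only a sketch under a few assumptions: it uses sashelp.class, names the quartile columns explicitly (age_p25 and so on) rather than relying on AUTONAME, and the dataset names stats and capped are made up for illustration.

```sas
/* Step 1: quartiles for all variables in one pass (explicit names). */
proc means data=sashelp.class noprint;
   var age height weight;
   output out=stats(drop=_type_ _freq_)
      p25 = age_p25 height_p25 weight_p25
      p75 = age_p75 height_p75 weight_p75;
run;

/* Step 2: cap every variable in a single data step.                 */
data capped;
   /* read the one-row stats dataset once; its values are retained   */
   if _n_ = 1 then set stats;
   set sashelp.class;
   array v   {3} age     height     weight;
   array q1_ {3} age_p25 height_p25 weight_p25;
   array q3_ {3} age_p75 height_p75 weight_p75;
   do i = 1 to dim(v);
      iqr    = q3_{i} - q1_{i};
      lo_lim = q1_{i} - 1.5 * iqr;     /* Q1 - 1.5*(Q3-Q1) */
      hi_lim = q3_{i} + 1.5 * iqr;     /* Q3 + 1.5*(Q3-Q1) */
      /* leave missing values alone; a plain "<" test would treat    */
      /* missing as below the lower limit and overwrite it           */
      if not missing(v{i}) then v{i} = min(max(v{i}, lo_lim), hi_lim);
   end;
   drop i iqr lo_lim hi_lim age_p: height_p: weight_p:;
run;
```

Note that the original macro's IF-THEN test would set missing values to the lower limit, whereas this sketch leaves them missing; drop the MISSING check if the original behavior is what you want. To drive this from a variable list like &vars, the three array definitions and the output names would have to be generated, which is the one place a small macro loop could still be useful.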