MsGeritO
Obsidian | Level 7

Dear all,

 

I have about 30 variables in which I cap outliers above the 97th percentile at the value of each variable's own 97th-percentile cut-off. To identify the 97th percentile I use PROC UNIVARIATE. The variables are all income variables of some kind, and an income of zero should be excluded from the percentile calculation.

 

Of course, I could add (where var NE 0) in the data selection, but I would rather not repeat the code 30+ times.

 

Can anybody help? My internet research hasn't delivered any solution.

 

Thank you very much in advance.

Gerit

 

1 ACCEPTED SOLUTION

FreelanceReinh
Jade | Level 19

Hello @MsGeritO,

 

You can also include in the macro the DATA step that "caps" the variables at their respective 97th percentiles:

%macro capvar(data=,     /* input dataset */
              out=,      /* output dataset */
              varlist=,  /* list of numeric variables to be capped */
              pctl=      /* cut-off percentile (e.g. 97) */
              );
%local i nv var;
%let nv=%sysfunc(countw(&varlist));

/* Compute percentiles */
%do i=1 %to &nv;
  %let var=%scan(&varlist,&i);
  proc univariate data=&data noprint;
  where &var ne 0;
  var &var;
  output out=_p&i pctlpts=&pctl pctlpre=_ pctlname=pctl;
  run;
%end;

/* Write percentiles to macro variables */
data _null_;
set _p1-_p&nv;
call symputx(cat('_p',_n_),_pctl,'L');
run;

/* Create output dataset */
data &out;
set &data;
%do i=1 %to &nv;
  %let var=%scan(&varlist,&i);
  if &var>&&_p&i then &var=&&_p&i;
%end;
run;
%mend capvar;

Example call of the macro:

%capvar(data=sashelp.heart, out=want, varlist=Weight Diastolic Systolic, pctl=97);
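
One way to check the result of the example call is to compare the maxima before and after capping, and then remove the helper datasets the macro leaves behind. This is only a sketch and not part of the original reply. It assumes the macro was called exactly as above, so that `want` and the one-row percentile datasets `_p1` to `_p3` exist in the WORK library.

```sas
/* Range of the variables before capping */
proc means data=sashelp.heart n min max;
  var Weight Diastolic Systolic;
run;

/* Range after capping: each maximum should no longer exceed
   the 97th percentile of that variable's nonzero values */
proc means data=want n min max;
  var Weight Diastolic Systolic;
run;

/* Optional cleanup of the percentile datasets created by %capvar
   (three of them here, because varlist held three variables) */
proc datasets library=work nolist;
  delete _p1-_p3;
run;
quit;
```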


6 REPLIES
PaigeMiller
Diamond | Level 26

A macro would work here. Example:

 

%macro dothis;
    %local i names thisvar;
    proc contents data=have noprint out=_contents_;
    run;
    proc sql noprint;
        select name into :names separated by ' ' from _contents_;
    quit;
    /* &sqlobs is set by PROC SQL to the number of rows selected */
    %do i=1 %to &sqlobs;
        %let thisvar=%scan(&names,&i,%str( ));
        /* Add any other options to PROC UNIVARIATE that you want */
        proc univariate data=have(where=(&thisvar^=0));
            var &thisvar;
        run;
    %end;
%mend;
%dothis
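
PROC CONTENTS returns every variable in the dataset, including character variables, for which the comparison `^=0` would fail. If `have` contains such variables, the name list could be restricted to numeric ones. A minimal sketch, assuming the dataset is called `have` as in the macro above; the `%put` line is only there for inspection:

```sas
proc contents data=have noprint out=_contents_;
run;

%let names=;   /* make sure the macro variable exists even if no row is selected */
proc sql noprint;
    select name into :names separated by ' '
    from _contents_
    where type = 1;   /* in the OUT= dataset, 1 = numeric and 2 = character */
quit;

/* Show how many numeric variables were found and their names */
%put NOTE: &sqlobs numeric variable(s) found: &names;
```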

 

--
Paige Miller
MsGeritO
Obsidian | Level 7
Thank you, Paige Miller, for your help. I appreciate any pointers. I had thought about macros, but I am not very confident with them, so this is a big help.
MsGeritO
Obsidian | Level 7

Thank you. This does exactly what I needed and works perfectly.

ballardw
Super User

Create a temporary data set in which you replace the 0 values with missing.

Something like:

data temp;
   set have;
   array inc <list of income variable names goes here>;
   do _i_ = 1 to dim(inc);
      if inc[_i_]=0 then inc[_i_]=.;
   end;
   drop _i_;
run;

Then use that data set in PROC UNIVARIATE to find your outliers; missing values are left out of the percentile calculation.

MsGeritO
Obsidian | Level 7
Thank you, ballardw, for your post. It is so nice that people help. I thought about something similar but refrained, because I would need to record which values are "." because they are really missing and which are "." because they were zero; this information is needed at a later stage. Of course this is possible too, just more complicated again 😉 Thank you nonetheless.
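
One possible way around the concern about telling real missing values from recoded zeros is a special missing value such as `.Z`. SAS procedures treat it as missing, yet it remains distinguishable from the ordinary `.`. The following is a sketch only, not something proposed in the thread; the variable names `income1-income30` and the dataset name `restored` are hypothetical placeholders.

```sas
/* Recode zeros to the special missing value .Z */
data temp;
   set have;
   array inc {*} income1-income30;   /* hypothetical variable names */
   do _i_ = 1 to dim(inc);
      if inc[_i_] = 0 then inc[_i_] = .Z;
   end;
   drop _i_;
run;

/* ... run PROC UNIVARIATE on TEMP; both . and .Z are excluded ... */

/* Later: turn .Z back into 0, leaving genuine missing values (.) untouched */
data restored;
   set temp;
   array inc {*} income1-income30;   /* same hypothetical names */
   do _i_ = 1 to dim(inc);
      if inc[_i_] = .Z then inc[_i_] = 0;
   end;
   drop _i_;
run;
```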
