MsGeritO
Obsidian | Level 7

Dear all,

 

I have about 30 variables whose outliers I cap: any value above a variable's 97th percentile is set to that variable's own 97th-percentile cut-off. I use PROC UNIVARIATE to identify the 97th percentile. The variables are all income variables of some kind, and an income of zero should be excluded from the procedure.

Of course, I could add (where var NE 0) to the data selection, but I would rather not repeat the code 30+ times.

Can anybody help? My internet search hasn't turned up a solution.

 

Thank you very much in advance.

Gerit

 

PaigeMiller
Diamond | Level 26

A macro would work here. Example:

 

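%macro dothis;
    %local i names thisvar;
    /* Collect the variable names of HAVE */
    proc contents data=have noprint out=_contents_;
    run;
    proc sql noprint;
        /* type=1 keeps numeric variables only */
        select name into :names separated by ' ' from _contents_ where type=1;
    quit;
    /* SQLOBS is an automatic macro variable, so it is referenced as &sqlobs */
    %do i=1 %to &sqlobs;
        %let thisvar=%scan(&names,&i,%str( ));
        /* Add any other options to PROC UNIVARIATE that you want */
        proc univariate data=have(where=(&thisvar^=0));
            var &thisvar;
        run;
    %end;
%mend;
%dothis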

 

--
Paige Miller
MsGeritO
Obsidian | Level 7
Thank you, Paige Miller, for your help. I appreciate any pointers. I had thought about macros, but I am not very confident with them, so this is a big help.
FreelanceReinh (accepted solution)
Jade | Level 19

Hello @MsGeritO,

 

You can also put the DATA step that "caps" the variables at their respective 97th percentiles inside the macro:

%macro capvar(data=,     /* input dataset */
              out=,      /* output dataset */
              varlist=,  /* list of numeric variables to be capped */
              pctl=      /* cut-off percentile (e.g. 97) */
              );
%local i nv var;
%let nv=%sysfunc(countw(&varlist));

/* Compute percentiles */
%do i=1 %to &nv;
  %let var=%scan(&varlist,&i);
  proc univariate data=&data noprint;
  where &var ne 0;
  var &var;
  output out=_p&i pctlpts=&pctl pctlpre=_ pctlname=pctl;
  run;
%end;

/* Write percentiles to macro variables */
data _null_;
set _p1-_p&nv;
call symputx(cat('_p',_n_),_pctl,'L');
run;

/* Create output dataset */
data &out;
set &data;
%do i=1 %to &nv;
  %let var=%scan(&varlist,&i);
  if &var>&&_p&i then &var=&&_p&i;
%end;
run;
%mend capvar;

Example call of the macro:

%capvar(data=sashelp.heart, out=want, varlist=Weight Diastolic Systolic, pctl=97);
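
As a rough spot check of what the call above would change for a single variable, the sketch below recomputes the cut-off for Weight outside the macro and counts the observations above it. The names _chk, p97 and n_above are made up for this illustration and are not part of the macro.

```sas
/* 97th percentile of the non-zero Weight values, as in the macro */
proc univariate data=sashelp.heart(where=(Weight ne 0)) noprint;
   var Weight;
   output out=_chk pctlpts=97 pctlpre=p;   /* creates variable p97 */
run;

/* Count the observations that lie above the cut-off */
data _null_;
   if _n_ = 1 then set _chk;               /* p97 is retained for all rows */
   set sashelp.heart end=last;
   n_above + (Weight > p97);               /* missing Weight counts as 0 */
   if last then put 'Rows that would be capped for Weight: ' n_above;
run;
```

After running %capvar, no value of Weight in the output dataset should exceed this cut-off.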
MsGeritO
Obsidian | Level 7

Thank you. This does exactly what I needed and works perfectly.

ballardw
Super User

Create a temporary data set in which you replace the 0 values with missing.

Something like:

data temp;
   set have;
   array inc <list of income variable names goes here>;
   do _i_ = 1 to dim(inc);
      if inc[_i_]=0 then inc[_i_]=.;
   end;
run;

And use that set for proc univariate to find your outliers.
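
A minimal sketch of this idea, assuming sashelp.heart and the variables Weight, Diastolic and Systolic as stand-ins for the real data and its income variables: because PROC UNIVARIATE drops missing values separately for each analysis variable, a single call can return all cut-offs at once. The output dataset name pctls and the prefixes are chosen here only for illustration.

```sas
/* Stand-in data: replace sashelp.heart and the variable list with your own */
data temp;
   set sashelp.heart;
   array inc Weight Diastolic Systolic;
   do _i_ = 1 to dim(inc);
      if inc[_i_] = 0 then inc[_i_] = .;   /* zeros become missing */
   end;
   drop _i_;
run;

/* One call for all variables; one prefix per variable in the VAR list */
proc univariate data=temp noprint;
   var Weight Diastolic Systolic;
   output out=pctls pctlpts=97
          pctlpre=Weight_ Diastolic_ Systolic_;
run;
```

The dataset pctls then holds one row with Weight_97, Diastolic_97 and Systolic_97.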

MsGeritO
Obsidian | Level 7
Thank you, ballardw, for your post. It is so nice that people help. I thought about something similar, but decided against it because I would then need to record which values are "." because they are really missing and which are "." because they were zero; this information is needed at a later stage. Of course this is possible, too, just more complicated again 😉 Thank you nonetheless.
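
One possible way around this bookkeeping, sketched here as an assumption rather than something proposed in the thread, is a special missing value such as .Z for the former zeros. SAS procedures treat .Z as missing, but it stays distinguishable from the ordinary "." and can be turned back into 0 afterwards. The dataset and variable names below are stand-ins.

```sas
/* Stand-in data: replace sashelp.heart and the variable list with your own */
data temp;
   set sashelp.heart;
   array inc Weight Diastolic Systolic;
   do _i_ = 1 to dim(inc);
      /* .Z marks "was zero"; an ordinary . still means "really missing" */
      if inc[_i_] = 0 then inc[_i_] = .Z;
   end;
   drop _i_;
run;

/* ... compute the percentiles and cap the variables using TEMP ... */

/* Later: turn the marked values back into zeros */
data restored;
   set temp;
   array inc Weight Diastolic Systolic;
   do _i_ = 1 to dim(inc);
      if inc[_i_] = .Z then inc[_i_] = 0;
   end;
   drop _i_;
run;
```

Since any missing value, including .Z, compares as smaller than every number, a capping condition of the form var > cut-off leaves the marked values untouched.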
