Winsorization

Contributor
Posts: 50

Winsorization

Hello, 

I have accounting data and would like to winsorize the variables. From what I have read, there seem to be a number of ways to do it.

Is there an efficient and generally accepted way to winsorize accounting data, ideally with code or a macro?

 

Thanks

Esteemed Advisor
Posts: 5,401

Re: Winsorization

Check the WINSORIZED option in proc univariate.

PG
Contributor
Posts: 50

Re: Winsorization

This may sound silly, but where can I find that?
PROC Star
Posts: 1,190

Re: Winsorization

It is in the PROC UNIVARIATE documentation, and there is an example in the first article that I link to in my previous post :)

Contributor
Posts: 50

Re: Winsorization

Please bear with me. I have a dataset with variables such as div, size, lev, and roa that I want to winsorize. gvkey is the main identifier.

I have tried the code below, but to no avail. I suspect I am not filling in the parameters correctly. Please note that this is sample code I am trying to adapt. Your help is very much appreciated!

The macro is:

%macro winsor(dsetin=, dsetout=, byvar=none, vars=, type=winsor, pctl=1 99);

%if &dsetout = %then %let dsetout = &dsetin;
    
%let varL=;
%let varH=;
%let xn=1;

%do %until ( %scan(&vars,&xn)= );
    %let token = %scan(&vars,&xn);
    %let varL = &varL &token.L;
    %let varH = &varH &token.H;
    %let xn=%EVAL(&xn + 1);
%end;

%let xn=%eval(&xn-1);

data xtemp;
    set &dsetin;
    run;

%if &byvar = none %then %do;

    data xtemp;
        set xtemp;
        xbyvar = 1;
        run;

    %let byvar = xbyvar;

%end;

proc sort data = xtemp;
    by &byvar;
    run;

proc univariate data = xtemp noprint;
    by &byvar;
    var &vars;
    output out = xtemp_pctl PCTLPTS = &pctl PCTLPRE = &vars PCTLNAME = L H;
    run;

data &dsetout;
    merge xtemp xtemp_pctl;
    by &byvar;
    array trimvars{&xn} &vars;
    array trimvarl{&xn} &varL;
    array trimvarh{&xn} &varH;

    do xi = 1 to dim(trimvars);

        %if &type = winsor %then %do;
            if not missing(trimvars{xi}) then do;
              if (trimvars{xi} < trimvarl{xi}) then trimvars{xi} = trimvarl{xi};
              if (trimvars{xi} > trimvarh{xi}) then trimvars{xi} = trimvarh{xi};
            end;
        %end;

        %else %do;
            if not missing(trimvars{xi}) then do;
              if (trimvars{xi} < trimvarl{xi}) then delete;
              if (trimvars{xi} > trimvarh{xi}) then delete;
            end;
        %end;

    end;
    drop &varL &varH xbyvar xi;
    run;

%mend winsor;


Super User
Posts: 22,857

Re: Winsorization

If you're just starting to learn SAS, don't try to learn macros first. Figure things out in a data step first. You'll get to macros, and they definitely help, but at this point you likely won't be able to modify them or test them well enough to be sure they're correct.

 

The code you've shown is a macro definition, somewhat like a function definition in R, but you haven't called it yet. You need to execute the macro and pass in the correct parameters.
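
For the %winsor macro you posted, a call might look roughly like the sketch below. The dataset names work.mydata and work.mydata_w are placeholders I made up; the variable list is taken from your description. I have not run this against your data, so treat it as a starting point.

```sas
/* Run the %macro winsor ... %mend winsor; definition first, then call it. */
/* work.mydata and work.mydata_w are hypothetical dataset names.           */
%winsor(dsetin=work.mydata,
        dsetout=work.mydata_w,
        byvar=none,             /* or name a grouping variable here        */
        vars=div size lev roa,  /* space-separated list of variables       */
        type=winsor,            /* any other value deletes rows instead    */
        pctl=1 99);             /* lower and upper percentile cutoffs      */
```

Note that vars= takes a space-separated list, because the macro splits it with %scan.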

 

Here's a very simplified example that calculates the winsorized mean.

 

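%macro winsor_mean(dset=, var=, winsor=5);

	/* WINSOR= is an alias for the WINSORIZED= option.          */
	/* A whole number is the count winsorized in each tail.     */
	proc univariate data=&dset winsor=&winsor;
		var &var;
	run;

%mend winsor_mean;

%winsor_mean(dset=sashelp.class, var=weight, winsor=5);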
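
If the goal is to replace the extreme values in the data rather than just report a winsorized mean, a plain data step version for one variable could look something like this. It is only a sketch: the output names pctl and class_w and the 1st/99th percentile cutoffs are my own choices for illustration.

```sas
/* Step 1: compute the 1st and 99th percentiles of WEIGHT.          */
/* PCTLPRE=weight_p yields variables weight_p1 and weight_p99.      */
proc univariate data=sashelp.class noprint;
    var weight;
    output out=pctl pctlpts=1 99 pctlpre=weight_p;
run;

/* Step 2: pull values outside the cutoffs back to the cutoffs.     */
data class_w;
    if _n_ = 1 then set pctl;   /* cutoffs are retained on every row */
    set sashelp.class;
    if not missing(weight) then
        weight = min(max(weight, weight_p1), weight_p99);
    drop weight_p1 weight_p99;
run;
```

Once this makes sense for one variable, wrapping it in a macro or an array loop for several variables is a much smaller step.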

 

Contributor
Posts: 50

Re: Winsorization

Thank you very much
PROC Star
Posts: 1,190

Re: Winsorization

Rick Wicklin has two great articles on the subject:

 

How to Winsorize data in SAS

 

and

 

Winsorization: The good, the bad, and the ugly
