Theo_Gh
Obsidian | Level 7

Hello, 

I have accounting data and would like to winsorize the variables. I checked, and there seem to be a number of ways to do it.

Is there a most efficient and generally accepted way of winsorizing accounting data, with code or a macro?

 

Thanks

7 REPLIES
PGStats
Opal | Level 21

Check the WINSORIZED option in proc univariate.

PG
Theo_Gh
Obsidian | Level 7
This may sound silly, but how can I find that?
PeterClemmensen
Tourmaline | Level 20

It is in the PROC UNIVARIATE documentation, and there is an example in the first article that I link to in my previous post 🙂

Theo_Gh
Obsidian | Level 7

Please bear with me. I have a dataset with variables like div, size, lev, roa, etc., which I want to winsorize. I have gvkey as the main identifier.

I have tried using the code below, but to no avail. I'm sure I'm not filling out the code correctly. Please note that this is sample code I'm trying to use. Your help is very much appreciated!

The macro is:

%macro winsor(dsetin=, dsetout=, byvar=none, vars=, type=winsor, pctl=1 99);

%if &dsetout = %then %let dsetout = &dsetin;
    
%let varL=;
%let varH=;
%let xn=1;

%do %until ( %scan(&vars,&xn)= );
    %let token = %scan(&vars,&xn);
    %let varL = &varL &token.L;
    %let varH = &varH &token.H;
    %let xn=%EVAL(&xn + 1);
%end;

%let xn=%eval(&xn-1);

data xtemp;
    set &dsetin;
    run;

%if &byvar = none %then %do;

    data xtemp;
        set xtemp;
        xbyvar = 1;
        run;

    %let byvar = xbyvar;

%end;

proc sort data = xtemp;
    by &byvar;
    run;

proc univariate data = xtemp noprint;
    by &byvar;
    var &vars;
    output out = xtemp_pctl PCTLPTS = &pctl PCTLPRE = &vars PCTLNAME = L H;
    run;

data &dsetout;
    merge xtemp xtemp_pctl;
    by &byvar;
    array trimvars{&xn} &vars;
    array trimvarl{&xn} &varL;
    array trimvarh{&xn} &varH;

    do xi = 1 to dim(trimvars);

        %if &type = winsor %then %do;
            if not missing(trimvars{xi}) then do;
              if (trimvars{xi} < trimvarl{xi}) then trimvars{xi} = trimvarl{xi};
              if (trimvars{xi} > trimvarh{xi}) then trimvars{xi} = trimvarh{xi};
            end;
        %end;

        %else %do;
            if not missing(trimvars{xi}) then do;
              if (trimvars{xi} < trimvarl{xi}) then delete;
              if (trimvars{xi} > trimvarh{xi}) then delete;
            end;
        %end;

    end;
    drop &varL &varH xbyvar xi;
    run;

%mend winsor;



Reeza
Super User

If you're just starting to learn SAS, do not try to learn macros first. Figure things out in a data step first. You'll get to macros, and they definitely help, but at this point you likely won't be able to modify them or even test them to ensure they're correct.

 

The code you've shown is macro code, sort of like a function in R, but you haven't actually called it yet. You need to execute the macro and pass the correct parameters in.
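For the macro posted above, a call might look like the sketch below. The dataset names work.acct and work.acct_w are hypothetical placeholders, and the variable names are taken from the question. It assumes the %winsor macro definition has already been submitted in the same SAS session. Leaving byvar=none computes the percentiles over the whole dataset; using gvkey as the BY variable would instead compute them within each firm, which is probably not what is wanted here.

```sas
/* Assumption: the %winsor macro definition has already been run in this session. */
/* Assumption: work.acct is a placeholder name for the accounting dataset.        */
%winsor(dsetin=work.acct,        /* input dataset (hypothetical name)        */
        dsetout=work.acct_w,     /* output dataset (hypothetical name)       */
        byvar=none,              /* percentiles over the whole dataset       */
        vars=div size lev roa,   /* variables to winsorize                   */
        type=winsor,             /* any other value deletes rows instead     */
        pctl=1 99);              /* lower and upper percentile cutoffs       */
```

Writing to a separate output dataset keeps the original values available for comparison.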

 

Here's a very simplified example that calculates the winsorized mean.

 

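%macro winsor_mean(dset=, var=, winsor=5);

	/* WINSORIZED= (alias WINSOR=) reports the Winsorized mean; it does not change the data. */
	/* An integer value is the number of observations winsorized in each tail. */
	proc univariate data=&dset winsorized=&winsor;
		var &var;
	run;

%mend winsor_mean;

%winsor_mean(dset=sashelp.class, var=weight, winsor=5);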
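To replace the extreme values themselves rather than only report a Winsorized mean, the same idea can be worked out without macros in two steps. This is a minimal sketch on sashelp.class; the 5th/95th percentile cutoffs, the weight_p prefix, and the output dataset names are illustrative choices. Note that clipping at percentiles is a common approximation and is not guaranteed to match the count-based definition used by the WINSORIZED= option.

```sas
/* Step 1: compute the 5th and 95th percentiles of Weight.          */
/* PCTLPRE=weight_p with PCTLPTS=5 95 creates weight_p5, weight_p95. */
proc univariate data=sashelp.class noprint;
    var weight;
    output out=work.pctl pctlpts=5 95 pctlpre=weight_p;
run;

/* Step 2: clip values that fall outside those percentiles.         */
data work.class_w;
    /* Read the one-row percentile dataset once; variables read by  */
    /* SET are retained, so the cutoffs are available on every row. */
    if _n_ = 1 then set work.pctl;
    set sashelp.class;
    /* Leave missing values untouched */
    if not missing(weight) then
        weight = min(max(weight, weight_p5), weight_p95);
    drop weight_p5 weight_p95;
run;
```

Once this works for one variable, extending it to several variables with arrays is essentially what the longer macro above does.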

 

Theo_Gh
Obsidian | Level 7
Thank you very much
PeterClemmensen
Tourmaline | Level 20

Rick Wicklin has two great articles on the subject

 

How to Winsorize data in SAS

 

and

 

Winsorization: The good, the bad, and the ugly
