Forecaster
Obsidian | Level 7

I would like to compute a winsorized mean across variables (that is, for each row) and store it as a new variable in the original data set.

Is there a way in SAS to get a winsorized mean across variables? Below is the code that generates the input data set.

data have;
  input var1 $ var2 var3;
datalines;
_1 10 12
_2 20 14
_3 30 16
_4 40 18
_5 50 20
_6 60 22
_7 70 24
_8 80 26
;
run;

I know PROC IML can compute a winsorized mean for each column. I need something similar, but for each row instead of each column. The desired output data set would contain all the original variables (var1, var2, var3) plus the calculated winsorized mean for each row.

proc iml;
    use have;
    read all var _NUM_ into X;
    read all var _CHAR_ into C;
    close have;
    mean = mean(x[,],"winsorized", 1);
    create want from mean;
    append from mean;
quit;

Thanks for your help.

1 ACCEPTED SOLUTION

Rick_SAS
SAS Super FREQ

Agree with Steve.

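data have;
array var[10];
do i = 1 to 20;
   do j = 1 to dim(var);
      var{j} = rand("normal", i);   /* mean i, standard deviation 1 */
   end;
   output;
end;
drop i j;   /* keep the loop counters out of the numeric data read by IML */
run;

proc iml;

use have;
read all var _NUM_ into X[c=varNames];
close have;

/* MEAN works on columns, so transpose X to get one winsorized mean per row */
mean = mean(x`,"winsorized", 1);
newX = x || mean`;

varNames = varNames || "WinMean";
create want from newX[c=varNames];
append from newX;
close want;
quit;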


7 REPLIES
SteveDenham
Jade | Level 19

Can you read all of this into a matrix, transpose the matrix, get the winsorized mean for each column (each transposed row), append it to the matrix, and then transpose back?

Steve Denham


Forecaster
Obsidian | Level 7

Rick, perfect, thanks so much. One additional question: what if the original data set has a character variable that I want to carry over to the newly created data set? PROC IML is not letting me mix character and numeric variables in one matrix.

Can you please let me know how you would do this?

I appreciate your response.

Thanks so much.

Reeza
Super User

Are you going into IML solely for the winsorized mean? If so, PROC UNIVARIATE can calculate the winsorized mean with a similar process: transpose, calculate, transpose again.

Forecaster
Obsidian | Level 7

I have 120,000 observations. Transposing 120K observations and running PROC UNIVARIATE on them would have performance issues, which is why I was looking at PROC IML for the winsorized mean.

Reeza
Super User

How many variables do you have? For 120K rows with 50 variables, it ran in less than a minute for me.

The output comes in a column format, so it's transpose, UNIVARIATE, then merge.

data random;
    array var(50) var1-var50;
    do ID=1 to 120000;
        do k=1 to 50;
            var(k)=rand('normal', 20, k);
        end;
        output;
    end;
    drop k;
run;

*Transpose;
proc transpose data=random out=step1 prefix=value;
    by ID;
run;

ods listing close;

proc univariate data=step1 winsorized=0.1;
    by ID;
    var value1;
    ods output winsorizedMeans=step2;
run;
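
The code above stops after the UNIVARIATE step. A minimal sketch of the remaining merge step might look like the following. It assumes that step2 holds one row per ID and that the winsorized estimate is stored in a column named Mean; that column name is an assumption, so run PROC CONTENTS on step2 to confirm it before relying on this. The names want and WinMean are illustrative.

```sas
/* Assumption: step2 (from ODS OUTPUT WinsorizedMeans=) has one row per ID
   and stores the estimate in a column named Mean. Check with:
   proc contents data=step2; run;                                          */
data want;
    merge random
          step2(keep=ID Mean rename=(Mean=WinMean));
    by ID;   /* both data sets are already in ID order */
run;
```

Because both data sets are keyed by ID, this is a match-merge rather than a positional merge, so it does not depend on the two data sets having identical row order beyond being sorted by ID.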

Rick_SAS
SAS Super FREQ

The efficient way is to write ONLY the winsorized mean to a new data set and then use the DATA step to add the new column to the original data. This avoids reading all the data into IML just so that you can write it out again:

WinMean = mean(x`,"winsorized", 1);
create want var {"WinMean"};
append;
close want;
quit;

data want;
merge have want;
run;
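
For reference, the fragment above can be assembled into a complete program. The following is a sketch under stated assumptions, not a definitive implementation: the data set and its variable names (id, x1-x5) and the intermediate data set name winmeans are made up for illustration, and five numeric columns are used so that winsorizing one value from each tail is meaningful. The character variable never enters IML; it is carried along by the final merge.

```sas
/* Hypothetical example data: one character ID plus five numeric columns */
data have;
    input id $ x1-x5;
    datalines;
a 1 2 3 4 100
b 5 6 7 8 9
c 10 20 30 40 50
;
run;

proc iml;
    use have;
    read all var _NUM_ into X;     /* numeric columns only */
    close have;

    /* MEAN works on columns, so transpose X to get one value per row;
       T() turns the resulting row vector into a column vector          */
    WinMean = T( mean(X`, "winsorized", 1) );

    create winmeans var {"WinMean"};   /* write only the new column */
    append;
    close winmeans;
quit;

/* One-to-one merge (no BY statement): this relies on winmeans having
   the same number of rows, in the same order, as have                */
data want;
    merge have winmeans;
run;
```

Worked by hand, winsorizing one value in each tail of the first row (1, 2, 3, 4, 100) gives (2, 2, 3, 4, 4), whose mean is 3, compared with an ordinary mean of 22. That is the value to look for in WinMean for row a when checking the result.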
