Array Imputation with Aggregate

Contributor
Posts: 64

I love SAS for its arrays. I often use them for imputations like this one, which changes missing values to 75:

 

data want;
   set have;
   array change x1-x999;   /* implicit array (no [*]), as DO OVER requires */
   do over change;
      if change=. then change=75;
   end;
run;

But what if, instead of changing to 75, I wanted to impute the minimum value of x?

 

Thinking about this hurts my brain because I know the array moves "sideways" across a row, while I'm looking for a whole-dataset aggregation to obtain the minimum.

 

I'm sure I could hack something together, but I'm really worried about efficiency because of my dataset size.


Accepted Solutions
Solution
07-07-2016 11:00 PM
Super User
Posts: 19,870

Re: Array Imputation with Aggregate

I'm assuming you're thinking of going column by column? Or do you want the minimum across all x variables and all observations?

 

You should take a look at PROC STDIZE with its MISSING= and REPLACE options.
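
A minimal sketch of the column-by-column case, assuming PROC STDIZE (SAS/STAT) is available. The data set names demo and demo_imputed and the sample values are hypothetical. It relies on METHOD=RANGE using each variable's minimum as its location measure, and on REPONLY replacing missing values with that location measure without standardizing anything else:

```sas
/* Hypothetical sample data; demo and demo_imputed are illustrative names */
data demo;
   input x1 x2 x3;
   datalines;
10 12 11
 3  .  5
14  8  .
;
run;

/* METHOD=RANGE: the location measure of each variable is its minimum.
   REPONLY: only replace missing values with that location measure;
   non-missing values are left as they are. */
proc stdize data=demo out=demo_imputed method=range reponly;
   var x1-x3;
run;
```

For the real data the VAR statement would list x1-x999. Because the minima are computed per variable, this covers only the column-by-column reading of the question, not a single minimum across all x variables.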


All Replies
Super Contributor
Posts: 298

Re: Array Imputation with Aggregate

Why not?

 

We can use the MIN function with the array.

 

Since DO OVER is deprecated, I use an explicit indexed DO loop instead.

 

Here is the code.

data have;
input x1 x2 x3 x4 x5;
datalines;
10 12 11  3 10
 3  7 10  .  5
14  . 20  1  3
;
run;


data want;
   set have;
   array change[*] x1-x5;
   /* smallest non-missing value in the current row */
   row_min = min(of change[*]);
   do i = 1 to dim(change);
      if change[i] = . then change[i] = row_min;
   end;
   keep x:;
run;
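
Note that the step above imputes within each row: a missing value gets the smallest value of x1-x5 in the same observation. If the minimum of each column over the whole data set is wanted instead, one possible sketch is to compute the minima first with PROC MEANS and then apply them. The names mins, m1-m5 and want_col are made up for illustration, and the `have` data set from above is assumed to exist:

```sas
/* Pass 1: one-row data set holding the minimum of each column.
   m1-m5 are hypothetical names for the minima of x1-x5. */
proc means data=have noprint;
   var x1-x5;
   output out=mins(drop=_type_ _freq_) min=m1-m5;
run;

/* Pass 2: read the minima once (variables read by SET are retained),
   then fill each missing x with the minimum of its own column. */
data want_col;
   if _n_ = 1 then set mins;
   set have;
   array x[*] x1-x5;
   array m[*] m1-m5;
   do i = 1 to dim(x);
      if missing(x[i]) then x[i] = m[i];
   end;
   drop i m1-m5;
run;
```

With the sample data, the row-wise step fills x4 in the second observation with 3 and x2 in the third with 1, while this column-wise sketch fills them with 1 and 7 respectively.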

Respected Advisor
Posts: 4,934

Re: Array Imputation with Aggregate

It can be done in a single data step:

 

data test;
set sashelp.class;
if age = 13 then call missing(height, weight);
run;

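data testi;
if 0 then set test;            /* set up the variables without reading data */
array _x {*} _numeric_;        /* all numeric variables in test */
array _m {9999} _temporary_;   /* running minima; assumes at most 9999 numeric variables */
/* first pass: column minima (MIN ignores missing values) */
do while(not endmin);
    set test end=endmin;
    do i = 1 to dim(_x);
        _m{i} = min(_m{i}, _x{i});
        end;
    end;
/* second pass: replace missing values with the column minima */
do while(not endimp);
    set test end=endimp;
    do i = 1 to dim(_x);
        if missing(_x{i}) then _x{i} = _m{i};
        end;
    output;
    end;
drop i;
stop;
run;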
    
PG