DATA Step, Macro, Functions and more

Pick The Last Non Missing Value

Reply
Contributor
Posts: 64

Pick The Last Non Missing Value

I have the following dataset:
X Y
1 70
1 .
1 .
1 .
1 80
1 .
1 .

I want to assign the last non missing value to the missing observations. Expected output as follows
X Y
1 70
1 70
1 70
1 70
1 80
1 80
1 80

Please help. I tried do end loop but to no use.

Contributor
Posts: 64

Re: Pick The Last Non Missing Value

Adding to the above requirment, I may also need to pick last non missing in a BY group. That is, I don't need to assign 80 if X=2.
Regular Contributor
Posts: 161

Re: Pick The Last Non Missing Value

Remove the rename and drop statements if you want the new variable to be in its own column.

 

data temp;
   input X Y;
   datalines;
1 70
1 .
1 .
1 .
1 80
1 .
1 .
;

data temp2;
    set temp;
    retain last_non_missing;
    
    if (not missing(Y)) then
        last_non_missing = Y;

    rename last_non_missing=Y;
    drop Y;
run;

Regular Contributor
Posts: 161

Re: Pick The Last Non Missing Value

with your updated requirement, is this what you want? note that the Y value when X is 3 will be .

 

data temp;
   input X Y;
   datalines;
1 70
1 .
1 .
1 .
1 80
1 .
1 .
2 60
2 .
3 .
;

data temp2;
    set temp;
    retain last_non_missing;
    
    if (first.x) then
        last_non_missing = .;

    if (not missing(Y)) then
        last_non_missing = Y;

    by x;
    rename last_non_missing=Y;
    drop Y;
run;
Respected Advisor
Posts: 3,777

Re: Pick The Last Non Missing Value

This is a good place to apply the UPDATE trick it handles BY group automatically.

 

data XY;
   input X Y @@;
   cards;
1 70 1 . 1 . 1 .
1 80 1 . 1 . 
2 60 2 . 3 .
;;;;
   run;
proc print;
   run;
data LOCF;
   update XY(obs=0) XY;
   by X;
   output;
   run;
proc print;
   run;
Ask a Question
Discussion stats
  • 4 replies
  • 261 views
  • 0 likes
  • 3 in conversation