Solved: Re: Replace missing value with previous value

Belle · Posted 07-16-2015 10:28 PM

Hello,

I want to get each id with non-missing y and z. but if I have highest x with non-missing y or z, then I will keep them.

My approach is if I can replace missing x and y to the previous no-missing value, then at the end, I can just keep last.id. However, I'm having difficulty to replace these missing value. I try lag, but doesn't work well.

I appreciate any help.

data temp;

input id x y z;

cards;

1 2 3 2

1 4 5 .

1 4 . 4

1 4 . 7

2 6 4 2

2 6 8 .

2 7 . 8

2 7 9 .

2 9 . .

3 4 2 3

3 4 4 .

3 7 . 9

3 7 . 6

4 2 . 1

4 3 4 2

;

run;

The result I expect is:

id x y z;

1 4 5 7

2 7 9 8

3 7 4 6

4 3 4 2

ChrisNZ · Posted 07-17-2015 06:01 AM

data HAVE;

input ID X Y Z;

cards;

1 2 3 2

1 4 5 .

1 4 . 4

1 4 . 7

2 6 4 2

2 6 8 .

2 7 . 8

2 7 9 .

2 9 . .

3 4 2 3

3 4 4 .

3 7 . 9

3 7 . 6

4 2 . 1

4 3 4 2

run;

data WANT;

merge HAVE(keep=ID X Y where=(Y ne .))

HAVE(keep=ID X Z where=(Z ne .));

by ID ;

if last.ID;

putlog ID X Y Z;

run;

1 4 5 7

2 7 9 8

3 7 4 6

4 3 4 2

High-Performance SAS Coding - Third Edition

View solution in original post

slchen · Posted 07-16-2015 11:17 PM

proc sql;

select id, max(x) as x,max(y) as y,max(z) as z from temp group by id;

quit;

Belle · Posted 07-17-2015 12:36 AM

sIchen -- Thanks for your reply. I'm not look for the max value on each column. I am looking for the non-missing value for y and z that has the latest x.

HarryLiu · Posted 07-17-2015 12:56 AM

Dear Belle,

You can try this code:

data have;

set temp;

if y ne . and z ne . then output;

run; *build a new data set with non-missing of y an z.*

proc sql;

select id max(x) as x, y, z *select the highest x from the data set*

from have

group by id;

quit;

proc print;

run;

Belle · Posted 07-17-2015 12:59 PM

HarryLiu,

I appreciate your help, I am not trying to get no missing fields. I want to replace these missing fields with previous value, so I can get the last.id information. But your information may help my future cases.

HarryLiu · Posted 07-17-2015 12:26 AM

Hi Belle,

To replace missing value, you can use retain statement.

The sample code is as following:

data want;

set temp;

retain new_y;

if missing (y) then new_y=x;

else new_y=y;

output;

run;

proc print;

run;

Best,

Harry Liu

stat_sas · Posted 07-17-2015 12:31 AM

data want(drop=i a:);

set temp;

by id;

retain a1-a3;

array t(*) a1-a3;

array v(*) x y z;

do i=1 to dim(v);

if v(i) ne . then t(i)=v(i);

v(i)=t(i);

end;

if last.id;

run;

Belle · Posted 07-17-2015 12:55 PM

Thanks stat@sas, your answer is pretty close, except for id2, x should be 7 instead of 9. if I delete missing y and z at the beginning, then it works.

ChrisNZ · Posted 07-17-2015 06:01 AM

data HAVE;

input ID X Y Z;

cards;

1 2 3 2

1 4 5 .

1 4 . 4

1 4 . 7

2 6 4 2

2 6 8 .

2 7 . 8

2 7 9 .

2 9 . .

3 4 2 3

3 4 4 .

3 7 . 9

3 7 . 6

4 2 . 1

4 3 4 2

run;

data WANT;

merge HAVE(keep=ID X Y where=(Y ne .))

HAVE(keep=ID X Z where=(Z ne .));

by ID ;

if last.ID;

putlog ID X Y Z;

run;

1 4 5 7

2 7 9 8

3 7 4 6

4 3 4 2

High-Performance SAS Coding - Third Edition

Belle · Posted 07-17-2015 01:00 PM

Thanks Chris, your answer works well.

ChrisNZ · Posted 07-19-2015 02:32 AM

No worries. It s the first time I see SAS's merge behaviour with multiple keys actually be useful!

High-Performance SAS Coding - Third Edition

viveklanka · Posted 07-17-2015 07:21 AM

You can do this way:

data want;

set have;

retain Y_new;

by ID;

if ID=lag(ID) then do;

if Y=. then Y=y_new;

else y_new=y;

end;

if last.ID then output want;

run;

Belle · Posted 07-17-2015 01:11 PM

viveklanka, thanks for you help.

Ksharp · Posted 07-17-2015 09:40 AM

Code: Program

data HAVE;
   input ID X Y Z;
   cards;
  1 2 3 2
  1 4 5 .
  1 4 . 4
  1 4 . 7
  2 6 4 2
  2 6 8 .
  2 7 . 8
  2 7 9 .
  2 9 . .
  3 4 2 3
  3 4 4 .
  3 7 . 9
  3 7 . 6
  4 2 . 1
  4 3 4 2
run;
data want;
 set have;
 by id;
 retain _x _y _z;
 if first.id then call missing(_x,_y,_z);
 if not missing(x) then _x=x;
 if not missing(y) then _y=y;
 if not missing(z) then _z=z;
 if last.id;
 drop x y z;
run;

Belle · Posted 07-17-2015 01:05 PM

Hi xia keshan,

Thanks for your helpful answer. your answer is same as stat@sas, problem on id2. But I can fix it by eliminate missing y and z first.

Code: Program

SAS Innovate 2025: Save the Date

SAS Training: Just a Click Away