BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Belle
Obsidian | Level 7

Hello,

I want to get each id with non-missing y and z. but if I have highest x with non-missing y or z, then I will keep them.

My approach is if I can replace missing x and y to the previous no-missing value, then at the end, I can just keep last.id. However, I'm having difficulty to replace these missing value. I try lag, but doesn't work well.

I appreciate any help.

data temp;

   input id x y z;

   cards;

   1 2 3 2

   1 4 5 .

   1 4 . 4

   1 4 . 7

   2 6 4 2

   2 6 8 .

   2 7 . 8

   2 7 9 .

   2 9 . .

   3 4 2 3

   3 4 4 .

   3 7 . 9

   3 7 . 6

   4 2 . 1

   4 3 4 2

   ;

run;

The result I expect is:

id    x     y     z;

1    4     5      7

2    7     9      8

3    7     4      6

4    3     4      2

1 ACCEPTED SOLUTION

Accepted Solutions
ChrisNZ
Tourmaline | Level 20

data HAVE;

   input ID X Y Z;

   cards;

   1 2 3 2

   1 4 5 .

   1 4 . 4

   1 4 . 7

   2 6 4 2

   2 6 8 .

   2 7 . 8

   2 7 9 .

   2 9 . .

   3 4 2 3

   3 4 4 .

   3 7 . 9

   3 7 . 6

   4 2 . 1

   4 3 4 2

run;

data WANT;

  merge HAVE(keep=ID X Y where=(Y ne .))

       HAVE(keep=ID X Z where=(Z ne .));

  by ID ;

if last.ID;

putlog ID X Y Z;

run;

1 4 5 7

2 7 9 8

3 7 4 6

4 3 4 2

View solution in original post

17 REPLIES 17
slchen
Lapis Lazuli | Level 10

proc sql;

   select id, max(x) as x,max(y) as y,max(z) as z from temp group by id;

   quit;

Belle
Obsidian | Level 7

sIchen -- Thanks for your reply. I'm not look for the max value on each column. I am looking for the non-missing value for y and z that has the latest x.

HarryLiu
Obsidian | Level 7

Dear Belle,

You can try this code:

data have;

set temp;

if y ne . and z ne . then output;

run; *build a new data set with non-missing of y an z.*

proc sql;

select id max(x) as x, y, z *select the highest x from the data set*

from have

group by id;

quit;

proc print;

run;

Belle
Obsidian | Level 7

HarryLiu,

I appreciate your help, I am not trying to get no missing fields. I want to replace these missing fields with previous value, so I can get the last.id information. But your information may help my future cases.

HarryLiu
Obsidian | Level 7

Hi Belle,

To replace missing value, you can use retain statement.

The sample code is as following:

data want;

set temp;

retain new_y;

if missing (y) then new_y=x;

else new_y=y;

output;

run;

proc print;

run;

Best,

Harry Liu

stat_sas
Ammonite | Level 13

data want(drop=i a:);

set temp;

by id;

retain a1-a3;

array t(*) a1-a3;

array v(*) x y z;

do i=1 to dim(v);

if v(i)  ne . then t(i)=v(i);

v(i)=t(i);

end;

if last.id;

run;

Belle
Obsidian | Level 7

Thanks stat@sas, your answer is pretty close, except for id2, x should be 7 instead of 9. if I delete missing y and z at the beginning, then it works.

ChrisNZ
Tourmaline | Level 20

data HAVE;

   input ID X Y Z;

   cards;

   1 2 3 2

   1 4 5 .

   1 4 . 4

   1 4 . 7

   2 6 4 2

   2 6 8 .

   2 7 . 8

   2 7 9 .

   2 9 . .

   3 4 2 3

   3 4 4 .

   3 7 . 9

   3 7 . 6

   4 2 . 1

   4 3 4 2

run;

data WANT;

  merge HAVE(keep=ID X Y where=(Y ne .))

       HAVE(keep=ID X Z where=(Z ne .));

  by ID ;

if last.ID;

putlog ID X Y Z;

run;

1 4 5 7

2 7 9 8

3 7 4 6

4 3 4 2

Belle
Obsidian | Level 7

Thanks Chris, your answer works well.

ChrisNZ
Tourmaline | Level 20

No worries. It s the first time I see SAS's merge behaviour with multiple keys actually be useful!

viveklanka
Fluorite | Level 6

You can do this way:

data want;

set have;

retain Y_new;

by ID;

if ID=lag(ID) then do;

if Y=. then Y=y_new;

      else y_new=y;

      end;

if last.ID then output want;

run;

Belle
Obsidian | Level 7

viveklanka, thanks for you help.

Ksharp
Super User

Code: Program

data HAVE;
   input ID X Y Z;
   cards;
  1 2 3 2
  1 4 5 .
  1 4 . 4
  1 4 . 7
  2 6 4 2
  2 6 8 .
  2 7 . 8
  2 7 9 .
  2 9 . .
  3 4 2 3
  3 4 4 .
  3 7 . 9
  3 7 . 6
  4 2 . 1
  4 3 4 2
run;
data want;
set have;
by id;
retain _x _y _z;
if first.id then call missing(_x,_y,_z);
if not missing(x) then _x=x;
if not missing(y) then _y=y;
if not missing(z) then _z=z;
if last.id;
drop x y z;
run;

Belle
Obsidian | Level 7

Hi xia keshan,

Thanks for your helpful answer. your answer is same as stat@sas, problem on id2. But I can fix it by eliminate missing y and z first. 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 17 replies
  • 6040 views
  • 10 likes
  • 9 in conversation