BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Belle
Obsidian | Level 7

Hello,

I want to get each id with non-missing y and z. but if I have highest x with non-missing y or z, then I will keep them.

My approach is if I can replace missing x and y to the previous no-missing value, then at the end, I can just keep last.id. However, I'm having difficulty to replace these missing value. I try lag, but doesn't work well.

I appreciate any help.

data temp;

   input id x y z;

   cards;

   1 2 3 2

   1 4 5 .

   1 4 . 4

   1 4 . 7

   2 6 4 2

   2 6 8 .

   2 7 . 8

   2 7 9 .

   2 9 . .

   3 4 2 3

   3 4 4 .

   3 7 . 9

   3 7 . 6

   4 2 . 1

   4 3 4 2

   ;

run;

The result I expect is:

id    x     y     z;

1    4     5      7

2    7     9      8

3    7     4      6

4    3     4      2

1 ACCEPTED SOLUTION

Accepted Solutions
ChrisNZ
Tourmaline | Level 20

data HAVE;

   input ID X Y Z;

   cards;

   1 2 3 2

   1 4 5 .

   1 4 . 4

   1 4 . 7

   2 6 4 2

   2 6 8 .

   2 7 . 8

   2 7 9 .

   2 9 . .

   3 4 2 3

   3 4 4 .

   3 7 . 9

   3 7 . 6

   4 2 . 1

   4 3 4 2

run;

data WANT;

  merge HAVE(keep=ID X Y where=(Y ne .))

       HAVE(keep=ID X Z where=(Z ne .));

  by ID ;

if last.ID;

putlog ID X Y Z;

run;

1 4 5 7

2 7 9 8

3 7 4 6

4 3 4 2

View solution in original post

17 REPLIES 17
slchen
Lapis Lazuli | Level 10

proc sql;

   select id, max(x) as x,max(y) as y,max(z) as z from temp group by id;

   quit;

Belle
Obsidian | Level 7

sIchen -- Thanks for your reply. I'm not look for the max value on each column. I am looking for the non-missing value for y and z that has the latest x.

HarryLiu
Obsidian | Level 7

Dear Belle,

You can try this code:

data have;

set temp;

if y ne . and z ne . then output;

run; *build a new data set with non-missing of y an z.*

proc sql;

select id max(x) as x, y, z *select the highest x from the data set*

from have

group by id;

quit;

proc print;

run;

Belle
Obsidian | Level 7

HarryLiu,

I appreciate your help, I am not trying to get no missing fields. I want to replace these missing fields with previous value, so I can get the last.id information. But your information may help my future cases.

HarryLiu
Obsidian | Level 7

Hi Belle,

To replace missing value, you can use retain statement.

The sample code is as following:

data want;

set temp;

retain new_y;

if missing (y) then new_y=x;

else new_y=y;

output;

run;

proc print;

run;

Best,

Harry Liu

stat_sas
Ammonite | Level 13

data want(drop=i a:);

set temp;

by id;

retain a1-a3;

array t(*) a1-a3;

array v(*) x y z;

do i=1 to dim(v);

if v(i)  ne . then t(i)=v(i);

v(i)=t(i);

end;

if last.id;

run;

Belle
Obsidian | Level 7

Thanks stat@sas, your answer is pretty close, except for id2, x should be 7 instead of 9. if I delete missing y and z at the beginning, then it works.

ChrisNZ
Tourmaline | Level 20

data HAVE;

   input ID X Y Z;

   cards;

   1 2 3 2

   1 4 5 .

   1 4 . 4

   1 4 . 7

   2 6 4 2

   2 6 8 .

   2 7 . 8

   2 7 9 .

   2 9 . .

   3 4 2 3

   3 4 4 .

   3 7 . 9

   3 7 . 6

   4 2 . 1

   4 3 4 2

run;

data WANT;

  merge HAVE(keep=ID X Y where=(Y ne .))

       HAVE(keep=ID X Z where=(Z ne .));

  by ID ;

if last.ID;

putlog ID X Y Z;

run;

1 4 5 7

2 7 9 8

3 7 4 6

4 3 4 2

Belle
Obsidian | Level 7

Thanks Chris, your answer works well.

ChrisNZ
Tourmaline | Level 20

No worries. It s the first time I see SAS's merge behaviour with multiple keys actually be useful!

viveklanka
Fluorite | Level 6

You can do this way:

data want;

set have;

retain Y_new;

by ID;

if ID=lag(ID) then do;

if Y=. then Y=y_new;

      else y_new=y;

      end;

if last.ID then output want;

run;

Belle
Obsidian | Level 7

viveklanka, thanks for you help.

Ksharp
Super User

Code: Program

data HAVE;
   input ID X Y Z;
   cards;
  1 2 3 2
  1 4 5 .
  1 4 . 4
  1 4 . 7
  2 6 4 2
  2 6 8 .
  2 7 . 8
  2 7 9 .
  2 9 . .
  3 4 2 3
  3 4 4 .
  3 7 . 9
  3 7 . 6
  4 2 . 1
  4 3 4 2
run;
data want;
set have;
by id;
retain _x _y _z;
if first.id then call missing(_x,_y,_z);
if not missing(x) then _x=x;
if not missing(y) then _y=y;
if not missing(z) then _z=z;
if last.id;
drop x y z;
run;

Belle
Obsidian | Level 7

Hi xia keshan,

Thanks for your helpful answer. your answer is same as stat@sas, problem on id2. But I can fix it by eliminate missing y and z first. 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 17 replies
  • 5160 views
  • 10 likes
  • 9 in conversation