BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Tpham
Quartz | Level 8

So I am writing a code and I seem to be stuck. I have been searching through google and haven't been able to find an answer. So I am hoping someone here can advise me.

I am trying to pick the last non-missing value.

This is my data

data has;

   input ID RX Time score;

   datalines;                     

1    1    6     10

1    1     12    50

1    1    15    60

2    1    6    10

2    1    12    50

2    1    15    .

;

I want to subset the data to pick only last time point that does not have a missing score. So for ID 1, I want to pick the record at time 15 since it is the last non-missing score. and for ID 2 i want to pick the record with time 12 since the score is missing at time 15.

The code I have come up with so far can only pick the last ID if it is not missing:

PROC SORT DATA=has;

BY ID Time ;

RUN;

DATA want;

SET has;

BY ID Time ;

if last.id and score NE . then select=1;

RUN;

What else do I have to do to get my program to work?


1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

try sorting the data by descending time so the non missing one is at the top:

proc sort data=has out=want;

  by id descending time;

  where score ne .;

run;

data want;

  set want;

  by id;

  if first.id then output;

run;

View solution in original post

4 REPLIES 4
RW9
Diamond | Level 26 RW9
Diamond | Level 26

try sorting the data by descending time so the non missing one is at the top:

proc sort data=has out=want;

  by id descending time;

  where score ne .;

run;

data want;

  set want;

  by id;

  if first.id then output;

run;

Kurt_Bremser
Super User

You might try this:

proc sort data=has;

by id descending time;

run;

data want;

set has;

by id;

retain select;

if first.id then select = 1;

if select = 1 and score ne .

then do;

  select = 0;

  output;

end;

run;

RW9 beat me with a more elegant solution!

art297
Opal | Level 21

If your data are already sorted than you could just use:

data want;

  set has (where=(not missing(score)));

  by id;

  if last.id;

run;

Tom
Super User Tom
Super User

Using WHERE instead of IF will make the LAST.ID variable work properly.


data want; set have; by id; where score is not null ; if last.id; run;

What do you want to do if all SCORE values for a particular ID are missing?  If you want them eliminated then that will do it.

If you want to keep them then you might need to work a little harder.

data want ;

  do _n_=1 by 1 until(last.id); set have; by id; if not missing(score) then keep=_n_ ; end;

  keep = min(keep,_n_);

  do _n_=1 by 1 until(last.id); set have; by id; if _n_=keep then output; end;

run;

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 17766 views
  • 0 likes
  • 5 in conversation