So I am writing a code and I seem to be stuck. I have been searching through google and haven't been able to find an answer. So I am hoping someone here can advise me.
I am trying to pick the last non-missing value.
This is my data
data has;
input ID RX Time score;
datalines;
1 1 6 10
1 1 12 50
1 1 15 60
2 1 6 10
2 1 12 50
2 1 15 .
;
I want to subset the data to pick only last time point that does not have a missing score. So for ID 1, I want to pick the record at time 15 since it is the last non-missing score. and for ID 2 i want to pick the record with time 12 since the score is missing at time 15.
The code I have come up with so far can only pick the last ID if it is not missing:
PROC SORT DATA=has;
BY ID Time ;
RUN;
DATA want;
SET has;
BY ID Time ;
if last.id and score NE . then select=1;
RUN;
What else do I have to do to get my program to work?
try sorting the data by descending time so the non missing one is at the top:
proc sort data=has out=want;
by id descending time;
where score ne .;
run;
data want;
set want;
by id;
if first.id then output;
run;
try sorting the data by descending time so the non missing one is at the top:
proc sort data=has out=want;
by id descending time;
where score ne .;
run;
data want;
set want;
by id;
if first.id then output;
run;
You might try this:
proc sort data=has;
by id descending time;
run;
data want;
set has;
by id;
retain select;
if first.id then select = 1;
if select = 1 and score ne .
then do;
select = 0;
output;
end;
run;
RW9 beat me with a more elegant solution!
If your data are already sorted than you could just use:
data want;
set has (where=(not missing(score)));
by id;
if last.id;
run;
Using WHERE instead of IF will make the LAST.ID variable work properly.
data want; set have; by id; where score is not null ; if last.id; run;
What do you want to do if all SCORE values for a particular ID are missing? If you want them eliminated then that will do it.
If you want to keep them then you might need to work a little harder.
data want ;
do _n_=1 by 1 until(last.id); set have; by id; if not missing(score) then keep=_n_ ; end;
keep = min(keep,_n_);
do _n_=1 by 1 until(last.id); set have; by id; if _n_=keep then output; end;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.