I am a new SAS learner, I can not use answers' program flexibly. For each ID, I want to replace the missing records before the first non-missing Treatment as 'Missing'; to replace missing records after the last non-missing Treatment as 'Discontinue'; to replace the missing records between the first non-missing Treatment and the last non-missing Treatment as 'Untrt'. Could you please give me a hand? Thank you in advance.
data have ;
input ID$ Month$ Treatment$ NEW_Treatment$ ;
cards;
1 Month_1 . Missing
1 Month_2 . Missing
1 Month_3 A A
1 Month_4 . Untrt
1 Month_5 B B
1 Month_6 . Untrt
1 Month_7 C C
1 Month_8 . Untrt
1 Month_9 . Untrt
1 Month_10 . Untrt
1 Month_11 E E
1 Month_12 . Discontinue
2 Month_1 B B
2 Month_2 C C
2 Month_3 . Untrt
2 Month_4 D D
2 Month_5 . Untrt
2 Month_6 . Untrt
2 Month_7 . Untrt
2 Month_8 . Untrt
2 Month_9 E E
2 Month_10 . Untrt
2 Month_11 . Untrt
2 Month_12 A A
3 Month_1 . Missing
3 Month_2 . Missing
3 Month_3 . Missing
3 Month_4 D D
3 Month_5 . Untrt
3 Month_6 . Untrt
3 Month_7 . Untrt
3 Month_8 . Untrt
3 Month_9 E E
3 Month_10 . Discontinue
3 Month_11 . Discontinue
3 Month_12 . Discontinue
;
run;
Sort in reverse chronological order to set the "Discontinue", then sort back into original order to set the other values:
data have ;
input ID$ Month Treatment$;
cards;
1 1 . Missing
1 2 . Missing
1 3 A A
1 4 . Untrt
1 5 B B
1 6 . Untrt
1 7 C C
1 8 . Untrt
1 9 . Untrt
1 10 . Untrt
1 11 E E
1 12 . Discontinue
2 1 B B
2 2 C C
2 3 . Untrt
2 4 D D
2 5 . Untrt
2 6 . Untrt
2 7 . Untrt
2 8 . Untrt
2 9 E E
2 10 . Untrt
2 11 . Untrt
2 12 A A
3 1 . Missing
3 2 . Missing
3 3 . Missing
3 4 D D
3 5 . Untrt
3 6 . Untrt
3 7 . Untrt
3 8 . Untrt
3 9 E E
3 10 . Discontinue
3 11 . Discontinue
3 12 . Discontinue
;
run;
proc sort data=have;
by id descending month;
run;
data int;
set have;
by id;
retain flag;
if first.id then flag = 1;
if not missing(treatment) then flag = 0;
else if flag then new_treatment = 'Discontinue';
run;
proc sort data=int;
by id month;
run;
data want;
set int;
by id;
retain flag;
if first.id then flag = 1;
if not missing(treatment)
then do;
flag = 0;
new_treatment = treatment;
end;
else if missing(new_treatment)
then do;
if flag
then new_treatment = 'Missing';
else new_treatment = 'Untrt';
end;
run;
Note that I made month numeric to facilitate correct sorting order. And that you will at least need a year column in real life application.
As posted under your original problem, here's a one -step method that assumes your data set is already in order:
data want;
ID_counter=0;
do until (last.id);
set have;
by ID;
ID_counter + 1;
if first_nonmissing = . and treatment > ' ' then first_nonmissing = ID_counter;
if treatment > ' ' then last_nonmissing = ID_counter;
end;
ID_counter = 0;
do until (last.ID);
set have;
by ID;
ID_counter + 1;
if treatment = ' ' then do;
if ID_counter < first_nonmissing then treatment = 'Missing';
else if ID_counter < last_nonmissing then treatment = 'Untrt';
else treatment = 'Discontinue';
end;
output;
end;
drop first_nonmissing last_nonmissing ID_counter;
run;
Note that if an ID has 100% missing values, they will become "Discontinue". The outcome is slightly different, compared to the post on your original question.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.