How to flag the last AVALC = 'Y' record prior to this AVALC='N' record. For example as below:
I need to flag the third A record and second B record.
data have;
input usubjid $ ADY AVALC $;
datalines;
A 15 N
A 36 N
A 57 Y
A 60 N
A 61 Y
A 62 N
A 63 Y
A 64 Y
A 65 Y
B 15 Y
B 36 Y
B 57 N
B 58 N
B 60 Y
;
run;
Why not flagging record 5? It is a Y prior to N.
What is the rule the says that the N on the 4th record is the (only) N to consider for usubjid A? Examples without rules are incomplete.
Thank you for providing data in the data step.
Hi @SerenaJJ
A simple solution is to use a second set-statement to perform look-ahead.
The first data step flags all occurences of a Y followed ny N within the same ID. The next is a bit more complicated, because it is necessary to keep information of the flag having been set already.
data want (drop=nextavalc nextusubjid);
set have end=eof;
by usubjid;
if not eof then do;
set have (firstobs=2 keep=avalc usubjid rename=(avalc=nextavalc usubjid=nextusubjid));
if not eof and usubjid=nextusubjid and avalc='Y' and nextavalc='N' then Flag=1;
end;
run;
data want2 (drop=FlagSet nextavalc nextusubjid);
set have end=eof;
by usubjid;
retain FlagSet;
if first.usubjid then flagset = 0;
if not eof then do;
set have (firstobs=2 keep=avalc usubjid rename=(avalc=nextavalc usubjid=nextusubjid));
if not eof and usubjid=nextusubjid and FlagSet=0 and avalc='Y' and nextavalc='N' then do;
Flag=1;
FlagSet=1;
end;
end;
run;
The conditional execution of the do-block is needed to get the last observation written.
Rename of the variables from the second set statement prevents that the values from the first set statement is owerwritten in the program vector, so it is possible to compare the values from the corrent and the next observation.
The comparison of the ID's prevents the last observation with value Y to be flagged if the next observation has value N, but not the same ID.
I've edited this response to include 3 scenarios:
I'm a great fan of the capabilities of multiple SET statements, but I see this task as a single SET situation. If what you really want is to flag EVERY instance of the last Y to precede an N (3rd A, 5th A and 2nd B), then a single SET with a BY ... NOTSORTED does the trick:
data have;
input usubjid $ ADY AVALC $;
datalines;
A 15 N
A 36 N
A 57 Y
A 60 N
A 61 Y
A 62 N
A 63 Y
A 64 Y
A 65 Y
B 15 Y
B 36 Y
B 57 N
B 58 N
B 60 Y
;
run;
data want;
set have;
by usubjid avalc notsorted;
if avalc='Y' and last.avalc=1 and last.usubjid=0 then flag=1;
else flag=0;
run;
Now if your rule is to flag only the first qualifying observations for each ID (i.e. 3rd A and 2nd B), very little needs to be changed:
data want (drop=_:);
set have;
by usubjid avalc notsorted;
if first.usubjid=1 then _count=0;
if avalc='Y' and last.avalc=1 and last.usubjid=0 AND _count=0 then flag=1;
else flag=0;
_count+flag;
run;
Adding this 3rd scenario DOES require multiple sets:
data want (drop=_:);
set have;
by usubjid avalc notsorted;
retain _last_y ; /*Indicates Final instance of Y preceding an N */
if avalc='Y' and last.avalc=1 and last.usubjid=0 then _last_y=_n_;
if last.usubjid then do until (last.usubjid);
set have;
by usubjid;
_n+1;
if _n=_last_y then flag=1;
else flag=0;
output;
end;
run;
data have;
input usubjid $ ADY AVALC $;
datalines;
A 15 N
A 36 N
A 57 Y
A 60 N
A 61 Y
A 62 N
A 63 Y
A 64 Y
A 65 Y
B 15 Y
B 36 Y
B 57 N
B 58 N
B 60 Y
;
run;
data want;
set have;
by usubjid AVALC notsorted;
retain found 0;
if first.usubjid then found=0;
if last.AVALC and AVALC='Y' and not found then do;found=1;flag=1;end;
drop found;
run;
data have; input usubjid $ ADY AVALC $; datalines; A 15 Y A 36 N A 57 Y A 60 N A 61 Y A 62 N A 63 Y A 64 N A 65 Y B 15 Y B 36 Y B 57 N B 58 N B 60 Y ; run; data want; do until(last.usubjid); set have; by usubjid; row+1; if ^last.usubjid and avalc = 'Y' then do; next=row+1; set have (keep = avalc rename=(avalc = _avalc)) point=next; if _avalc = 'N' then last = ady; end; end; do until(last.usubjid); set have; by usubjid; if ady = last then flag = 'Y'; output; call missing(flag); end; call missing(last); drop _avalc row last; run;
Can you try this?
Thank you so much for the coding. It works!!
Thank you so much for all the suggestion and silutions you share with me!!
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.