BookmarkSubscribeRSS Feed
SerenaJJ
Obsidian | Level 7

How to flag the last AVALC = 'Y' record prior to this AVALC='N' record. For example as below: 

I need to flag the third A record and second B record.

 

data have;
input usubjid $ ADY AVALC $;
datalines;
A 15 N
A 36 N
A 57 Y
A 60 N
A 61 Y
A 62 N
A 63 Y
A 64 Y
A 65 Y
B 15 Y
B 36 Y
B 57 N
B 58 N
B 60 Y
;
run;

14 REPLIES 14
ballardw
Super User

Why not flagging record 5? It is a Y prior to N.

What is the rule the says that the N on the 4th record is the (only) N to consider for usubjid A? Examples without rules are incomplete.

 

Thank you for providing data in the data step.

SerenaJJ
Obsidian | Level 7
Thank you so much for your suggestion. The specification is flagged the last AVALC = 'Y' record prior to this AVALC='N' record. If multiple records, then choose the first one. So for subjid level, only one record is kept.
Mazi
Pyrite | Level 9
Can you advise as to why row 5 would not be flagged? It satisfies your criteria. I presume this is being done at a subject level? Please confirm if that is the case.
SerenaJJ
Obsidian | Level 7
Thank you so much for your reply. The specification is flagged the last AVALC = 'Y' record prior to this AVALC='N' record. If multiple records, then choose the first one. So for subjid level, only one record is kept.
ErikLund_Jensen
Rhodochrosite | Level 12

Hi @SerenaJJ 

 

A simple solution is to use a second set-statement to perform look-ahead. 

 

The first data step flags all occurences of a Y followed ny N within the same ID.  The next is a bit more complicated, because it is necessary to keep information of the flag having been set already.

 

data want (drop=nextavalc nextusubjid);
  set have end=eof;
  by usubjid;
  if not eof then do;
    set have (firstobs=2 keep=avalc usubjid rename=(avalc=nextavalc usubjid=nextusubjid));
    if not eof and usubjid=nextusubjid and avalc='Y' and nextavalc='N' then Flag=1;
  end;
run; 

data want2 (drop=FlagSet nextavalc nextusubjid);
  set have end=eof;
  by usubjid;
  retain FlagSet;
  if first.usubjid then flagset = 0;
  if not eof then do;
    set have (firstobs=2 keep=avalc usubjid rename=(avalc=nextavalc usubjid=nextusubjid));
    if not eof and usubjid=nextusubjid and FlagSet=0 and avalc='Y' and nextavalc='N' then do;
      Flag=1;
      FlagSet=1;
    end;
  end;
run; 

The conditional execution of the do-block is needed to get the last observation written.

Rename of the variables from the second set statement prevents that  the values from the  first set statement is owerwritten in the program vector, so it is possible to compare the values from the corrent and the next observation.

The comparison of the ID's prevents the last observation with value Y to be flagged if  the next observation has value N, but not the same ID.

 

 

SerenaJJ
Obsidian | Level 7
Thank you so much for the solution! It is great!!
mkeintz
PROC Star

I've edited this response to include 3 scenarios:

  1. Every instance of last Y preceding an N
  2. First instance.
  3. Last instance.

 

I'm a great fan of the capabilities of multiple SET statements, but I see this task as a single SET situation.  If what you really want is to flag EVERY instance of the last Y to precede an N (3rd A, 5th A and 2nd B), then a single SET with a BY ... NOTSORTED does the trick:

 

data have;
input usubjid $ ADY AVALC $;
datalines;
A 15 N
A 36 N
A 57 Y
A 60 N
A 61 Y
A 62 N
A 63 Y
A 64 Y
A 65 Y
B 15 Y
B 36 Y
B 57 N
B 58 N
B 60 Y
;
run;

data want;
  set have;
  by usubjid  avalc notsorted;
  if avalc='Y' and last.avalc=1 and last.usubjid=0 then flag=1;
  else flag=0;
run;

Now if your rule is to flag only the first qualifying observations for each ID (i.e. 3rd A and 2nd B), very little needs to be changed: 

 

data want (drop=_:);
  set have;
  by usubjid  avalc notsorted;
  if first.usubjid=1 then _count=0;
  if avalc='Y' and last.avalc=1 and last.usubjid=0 AND _count=0 then flag=1;
  else flag=0;
  _count+flag;
run;

 

Adding this 3rd scenario DOES require multiple sets:

 

data want (drop=_:);
  set have; 
  by usubjid avalc notsorted;
  retain _last_y ;           /*Indicates Final instance of Y preceding an N */
  if avalc='Y' and last.avalc=1 and last.usubjid=0 then _last_y=_n_;

  if last.usubjid then do until (last.usubjid);
    set have;
    by usubjid;
    _n+1;
    if _n=_last_y then flag=1;
    else flag=0;
    output;
  end;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
SerenaJJ
Obsidian | Level 7
Thank you so much for your help! It works well!!
Ksharp
Super User
 

data have;
input usubjid $ ADY AVALC $;
datalines;
A 15 N
A 36 N
A 57 Y
A 60 N
A 61 Y
A 62 N
A 63 Y
A 64 Y
A 65 Y
B 15 Y
B 36 Y
B 57 N
B 58 N
B 60 Y
;
run;

data want;
 set have;
 by usubjid AVALC notsorted;
 retain found 0;
 if first.usubjid then found=0;
 if last.AVALC and AVALC='Y' and not found then do;found=1;flag=1;end;
 drop found;
run;
Mazi
Pyrite | Level 9
data have;
input usubjid $ ADY AVALC $;
datalines;
A 15 Y
A 36 N
A 57 Y
A 60 N
A 61 Y
A 62 N
A 63 Y
A 64 N
A 65 Y
B 15 Y
B 36 Y
B 57 N
B 58 N
B 60 Y
;
run;

data want;
  do until(last.usubjid);
  	set have;
  	by usubjid;
	row+1;
 	if ^last.usubjid and avalc = 'Y' then do;
		next=row+1;
		set have (keep = avalc rename=(avalc = _avalc)) point=next;
		if _avalc = 'N' then last = ady;
	end;
  end;
  do until(last.usubjid);
  	set have;
  	by usubjid;
	if ady = last then flag = 'Y';
 	output;
	call missing(flag);
  end;
  call missing(last);
  drop _avalc row last;
run;

Can you try this?

SerenaJJ
Obsidian | Level 7
Thank you so much for the coding. Based on the specification and the final dataset for QC, they have the third A and second B subjids to be flagged. Here I can get the fifth A and second B flagged. Anyway, really appreciate your help!!
SerenaJJ
Obsidian | Level 7
Thank you so much for the solution!!!
SerenaJJ
Obsidian | Level 7

Thank you so much for the coding. It works!!

SerenaJJ
Obsidian | Level 7

Thank you so much for all the suggestion and silutions you share with me!! 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 14 replies
  • 983 views
  • 5 likes
  • 6 in conversation