DATA Step, Macro, Functions and more

Extract records after second last 'PASS' by ID

Accepted Solution Solved
Reply
New Contributor
Posts: 3
Accepted Solution

Extract records after second last 'PASS' by ID

Hello,

I have a dataset that lists records by ID of exam date and result. I only want to output the data after an ID's second last 'PASS'. How do I achieve this?

Many thanks.

Max



Source data:


ID   DATE         RESULT

85  15JAN2010 FAIL
85  25JAN2010 PASS
85  15FEB2011 FAIL
85  25FEB2011 PASS
85  01MAR2011 FAIL
85  07MAR2011 FAIL
85  10MAR2011 PASS
66  10OCT2015 FAIL
66  15OCT2015 PASS
66  10SEP2016 FAIL
66  15SEP2016 PASS


Desired output data:

85  01MAR2011 FAIL
85  07MAR2011 FAIL
85  10MAR2011 PASS
66  10SEP2016 FAIL
66  15SEP2016 PASS


Accepted Solutions
Solution
‎11-22-2016 01:18 PM
Trusted Advisor
Posts: 1,022

Re: Extract records after second last 'PASS' by ID

[ Edited ]

Your data appears to be sorted by date for each ID, but not sorted by ID.  You can get what you need without further sorting of the data:

 

data have;
informat id $2. date date9. status $12.;
format date date9.;
input id date status $;
cards;
85  15JAN2010 FAIL
85  25JAN2010 PASS
85  15FEB2011 FAIL
85  25FEB2011 PASS
85  01MAR2011 FAIL 
85  07MAR2011 FAIL 
85  10MAR2011 PASS
66  10OCT2015 FAIL
66  15OCT2015 PASS
66  10SEP2016 FAIL 
66  15SEP2016 PASS 
;
run;


data want (drop=npass cutpass);
  /* Count total number of "PASS" records in the group */
  do npass=0 by 0 until (last.id);
    set have;
	by id notsorted;
    if status='PASS' then npass=npass+1;
  end;

  cutpass=npass-1; 

  /* Reread,, but output only starting with record after NPASS reaches CUTPASS*/
  do npass=0 by 0 until (last.id);
    set have;
	by id notsorted;
	if npass>=cutpass then output;
    /* if npass>=cutpass>0 then output; */
if status="PASS" then npass=npass+1; end; run;

 

@PGStats pointed out that my suggestion would output groups in which there existed no second-to-last "PASS" record. Excellent observation, and easily corrected.  Using the commented "if npass ..." line instead of the original would address that issue.

 

Regards,

Mark

View solution in original post


All Replies
Super User
Posts: 19,815

Re: Extract records after second last 'PASS' by ID

If you reverse the order of your data, ID by date descending, you can take records from the first until the second pass which is a much easier logical problem to solve. 

 

 

New Contributor
Posts: 3

Re: Extract records after second last 'PASS' by ID

Can you provide a sample code? Thanks.

 

 

Super User
Posts: 19,815

Re: Extract records after second last 'PASS' by ID

data have;
informat id $2. date date9. status $12.;
format date date9.;
input id date status $;
cards;
85  15JAN2010 FAIL
85  25JAN2010 PASS
85  15FEB2011 FAIL
85  25FEB2011 PASS
85  01MAR2011 FAIL 
85  07MAR2011 FAIL 
85  10MAR2011 PASS
66  10OCT2015 FAIL
66  15OCT2015 PASS
66  10SEP2016 FAIL 
66  15SEP2016 PASS 
;
run;

proc sort data=have;
by id descending date;
run;

data want;
set have;
by id descending date;

retain flag;

if first.id then flag=0;
else if status='PASS' then flag=1;

if flag=0 then output;
run;

proc sort data=want;
by id date;
run;
proc print; run;
Respected Advisor
Posts: 4,925

Re: Extract records after second last 'PASS' by ID

[ Edited ]

A single step solution:

 

data want;
do until (last.id);
    set have(where=(status="PASS")); by id notsorted;
    if not last.id then passDate = date;
    end;
thisId = id;
do until (id = thisId and last.id);
    set have; by id notsorted;
    if id = thisId and -date < -passDate then output;
    end;
drop thisId passDate;
run;

proc print noobs; run;

Note that this code skips Ids when all records for an Id are FAILs or there is a single PASS record.  Other solutions proposed by @Reeza and @mkeintz output all obs in those cases.

PG
Solution
‎11-22-2016 01:18 PM
Trusted Advisor
Posts: 1,022

Re: Extract records after second last 'PASS' by ID

[ Edited ]

Your data appears to be sorted by date for each ID, but not sorted by ID.  You can get what you need without further sorting of the data:

 

data have;
informat id $2. date date9. status $12.;
format date date9.;
input id date status $;
cards;
85  15JAN2010 FAIL
85  25JAN2010 PASS
85  15FEB2011 FAIL
85  25FEB2011 PASS
85  01MAR2011 FAIL 
85  07MAR2011 FAIL 
85  10MAR2011 PASS
66  10OCT2015 FAIL
66  15OCT2015 PASS
66  10SEP2016 FAIL 
66  15SEP2016 PASS 
;
run;


data want (drop=npass cutpass);
  /* Count total number of "PASS" records in the group */
  do npass=0 by 0 until (last.id);
    set have;
	by id notsorted;
    if status='PASS' then npass=npass+1;
  end;

  cutpass=npass-1; 

  /* Reread,, but output only starting with record after NPASS reaches CUTPASS*/
  do npass=0 by 0 until (last.id);
    set have;
	by id notsorted;
	if npass>=cutpass then output;
    /* if npass>=cutpass>0 then output; */
if status="PASS" then npass=npass+1; end; run;

 

@PGStats pointed out that my suggestion would output groups in which there existed no second-to-last "PASS" record. Excellent observation, and easily corrected.  Using the commented "if npass ..." line instead of the original would address that issue.

 

Regards,

Mark

New Contributor
Posts: 3

Re: Extract records after second last 'PASS' by ID

Thanks so much for everyone who responded to my post! Thumbs up to SAS Communities!

 

 

Max

 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 249 views
  • 1 like
  • 4 in conversation