Katie
Obsidian | Level 7

Hello.  Here is what I have:

 

PT_ID   date        result   criteria_met
x       1/1/2015    500      yes
x       2/1/2015    470      no
x       3/1/2015    525      yes
z       2/2/2015    450      no
z       3/3/2015    575      yes
k       5/4/2015    650      yes

 

Basically, what I want is all the data for each patient up through their first 'yes' in the criteria_met column.  So patient x will have 1 observation, patient z will have 2, and patient k will have 1.

 

Any help is appreciated!  Thank you!

 

 

3 REPLIES
Patrick
Opal | Level 21

Code not tested but should do the job.

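data want(drop=_output_flg);
  set have;
  by pt_id date;  /* requires HAVE to be sorted by pt_id and date */
  retain _output_flg;
  /* switch output on at the start of each patient */
  if first.pt_id then _output_flg=1;
  if _output_flg=1 then output;
  /* switch output off once the first 'yes' row has been written */
  if upcase(criteria_met)='YES' then _output_flg=0;
run;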
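
Because the step above relies on BY-group processing, it expects the input in pt_id and date order, and the sample rows in the question (x, z, k) are not in ascending pt_id order. A minimal sketch of a pre-sort, assuming a dataset `have` already exists and `date` is stored as a numeric SAS date; `have_sorted` is a hypothetical name used only for illustration:

```sas
/* Sort a copy so the original row order in HAVE is left untouched. */
/* HAVE_SORTED is a hypothetical dataset name, not from the thread. */
proc sort data=have out=have_sorted;
  by pt_id date;
run;
```

The DATA step would then read `set have_sorted;` instead of `set have;`. Sorting puts patient k first, so the output order would differ from the input order.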
Jagadishkatam
Amethyst | Level 16

Please try

 

data have;
input PT_ID$         date: ddmmyy10.           result    criteria_met$;
format date date9.;
cards;
x           1/1/2015         500           yes
x           2/1/2015         470           no
x            3/1/2015        525          yes
z            2/2/2015        450           no
z            3/3/2015         575         yes
k            5/4/2015        650         yes
;

proc sort data=have;
by  pt_id descending criteria_met  date;
run;

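data want;
  /* First loop: advance to the first row of the next patient. After the sort,
     that row holds the earliest 'yes' date (if the patient has one). */
  do until(first.pt_id);
    set have;
    by pt_id descending criteria_met date;
    if first.pt_id then yesdate=date;
  end;
  /* Second loop: re-read the same patient and keep rows dated on or before it. */
  do until(last.pt_id);
    set have;
    by pt_id descending criteria_met date;
    if date<=yesdate then output;
  end;
  format yesdate date9.;
run;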
Thanks,
Jag
mkeintz
PROC Star

You want to output all records up through the first positive (criteria_met='yes').  So all you need to do is keep a running count of the number of positives encountered (N_POS in the program below).

 

Notes:

  1. The program below assumes the data are grouped by PT_ID (and sorted by date within each PT_ID group).  But, while grouped by PT_ID, it appears not to be in ascending PT_ID order - hence the "by pt_id NOTSORTED".  If it's not grouped, or not sorted by date within PT_ID, then sort dataset have by PT_ID/date first.
  2. The subsetting "if n_pos=0" might make you think that only records BEFORE the first positive will be output.  But note that N_POS is not updated until after the subsetting if - as a result the first positive will be output, even though N_POS has become 1.  All subsequent records within a PT_ID will be filtered out.  If you want to see the evidence remove the "drop n_pos" statement and look at the output.

 

data have;
input PT_ID$ date: ddmmyy10. result criteria_met$;
format date date9.;
cards;
x 1/1/2015 500 yes
x 2/1/2015 470 no
x 3/1/2015 525 yes
z 2/2/2015 450 no
z 3/3/2015 575 yes
k 5/4/2015 650 yes
;

data want;
  set have;
  by pt_id notsorted;
  if first.pt_id then n_pos=0;
  if n_pos=0;
  n_pos + (criteria_met='yes');
  drop n_pos;
run;
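
The check suggested in note 2 can be sketched as follows: the same logic, run without the drop statement so that N_POS stays visible. This is only an illustration; the dataset name `check` is an assumption, and the sample rows are repeated so the block can be run on its own.

```sas
/* Sample data, repeated from the program above. */
data have;
  input PT_ID$ date: ddmmyy10. result criteria_met$;
  format date date9.;
cards;
x 1/1/2015 500 yes
x 2/1/2015 470 no
x 3/1/2015 525 yes
z 2/2/2015 450 no
z 3/3/2015 575 yes
k 5/4/2015 650 yes
;

/* Same logic as WANT, but N_POS is kept so it can be inspected. */
/* CHECK is a hypothetical dataset name used for illustration.   */
data check;
  set have;
  by pt_id notsorted;
  if first.pt_id then n_pos=0;
  if n_pos=0;                     /* tested before N_POS is updated */
  n_pos + (criteria_met='yes');   /* sum statement: N_POS is retained automatically */
run;

proc print data=check noobs;
run;
```

In the printed result, the first 'yes' row kept for each patient should show n_pos=1, while the earlier 'no' row kept for patient z shows n_pos=0.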
