DATA Step, Macro, Functions and more

Do Until Loop

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 106
Accepted Solution

Do Until Loop

I have an issue that I think might work best with a do until loop. I have patient data for multiple hospital admissions and I want to keep all the records up until and including their first positive test, but delete the records after the first positive record.

Here's the data I have:

repid datepostive
1251/12/20140
1251/15/20141
14474/12/20130
144711/12/20130
14475/6/20140
14476/7/20140
144711/20/20141
14471/10/20150
14473/15/20150
884591/6/20140
45487812/8/20140
3162/9/20140
3163/18/20140
3166/2/20140
3169/28/20141
31612/1/20140

Want:

repid datetive
1251/12/20140
1251/15/20141
14474/12/20130
144711/12/20130
14475/6/20140
14476/7/20140
144711/20/20141
884591/6/20140
45487812/8/20140
3162/9/20140
3163/18/20140
3166/2/20140
3169/28/20141

Thanks,

Sarah


Accepted Solutions
Solution
‎04-03-2015 11:36 PM
Respected Advisor
Posts: 4,937

Re: Do Until Loop

Posted in reply to sarahsasuser

Maybe not the DO UNTIL you had in mind...

data have;

input repid date :mmddyy. positive;

format date yymmdd10.;

datalines;

125 1/12/2014 0

125 1/15/2014 1

1447 4/12/2013 0

1447 11/12/2013 0

1447 5/6/2014 0

1447 6/7/2014 0

1447 11/20/2014 1

1447 1/10/2015 0

1447 3/15/2015 0

88459 1/6/2014 0

454878 12/8/2014 0

316 2/9/2014 0

316 3/18/2014 0

316 6/2/2014 0

316 9/28/2014 1

316 12/1/2014 0

;

proc sort data=have; by repid date; run;

data want;

do until(last.repid);

    set have; by repid;

    if not pos then output;

    if positive then pos = 1;

    end;

drop pos;

run;

proc print data=want noobs; run;

PG

PG

View solution in original post


All Replies
Solution
‎04-03-2015 11:36 PM
Respected Advisor
Posts: 4,937

Re: Do Until Loop

Posted in reply to sarahsasuser

Maybe not the DO UNTIL you had in mind...

data have;

input repid date :mmddyy. positive;

format date yymmdd10.;

datalines;

125 1/12/2014 0

125 1/15/2014 1

1447 4/12/2013 0

1447 11/12/2013 0

1447 5/6/2014 0

1447 6/7/2014 0

1447 11/20/2014 1

1447 1/10/2015 0

1447 3/15/2015 0

88459 1/6/2014 0

454878 12/8/2014 0

316 2/9/2014 0

316 3/18/2014 0

316 6/2/2014 0

316 9/28/2014 1

316 12/1/2014 0

;

proc sort data=have; by repid date; run;

data want;

do until(last.repid);

    set have; by repid;

    if not pos then output;

    if positive then pos = 1;

    end;

drop pos;

run;

proc print data=want noobs; run;

PG

PG
Frequent Contributor
Posts: 106

Re: Do Until Loop

Thanks PGStats!

Do you mind walking me through the code a bit? the last.repid takes the last observation for that repid and outputs all the pos=0 until it gets to positive=1. Why do we need to make the variable 'pos'? This seems like an extra step. Could we have the following code? If not, why not?

do until (last.repid);

     set have;

     by repid;

     if positive ^=1 then output;

end;

Also, why are the set and by statements inside of the do loop?

Thanks for you help!

Respected Advisor
Posts: 4,937

Re: Do Until Loop

Posted in reply to sarahsasuser

Variable pos only takes two values: missing or 1. It is set to missing at the beginning of every iteration of the implicit datastep loop. In a logical test, a missing value is taken as FALSE. Thus every record in a repid group is output until after pos is assigned a value, i.e. after positive=1 has been met. Once it is assigned a value, variable pos only goes back to missing at the beginning of the next repid group.

PG

PG
Frequent Contributor
Posts: 106

Re: Do Until Loop

Thanks, this is very helpful.

Frequent Contributor
Posts: 106

Re: Do Until Loop

I have one more question about the last.repid function. I've added 4 more observations to the dataset:

41211/4/20090
4124/5/20131
4123/14/20141
4121/2/20150

I want to keep only the first 2 observations since I'm only interested in the volunteers until they become positive. I've used the same code you gave me and it does what I want. However, I thought last.repid would start from the bottom, work up the rows, giving me the last 2 observations. Why doesn't it?

Respected Advisor
Posts: 4,937

Re: Do Until Loop

Posted in reply to sarahsasuser

Unless you use the point= option, the set statement reads dataset observations one by one in the order they are stored. last.repid is a variable created by SAS that takes the value of 1 (true) when the next record in the dataset is not from the same repid group.

Read the documentation about first. and last. automatic variables to understand what's going on.

SAS(R) 9.4 Language Reference: Concepts, Fourth Edition

PG

PG
Frequent Contributor
Posts: 115

Re: Do Until Loop

Posted in reply to sarahsasuser

data have;

input repid date :mmddyy. positive;

format date yymmdd10.;

datalines;

125 1/12/2014 0

125 1/15/2014 1

1447 4/12/2013 0

1447 11/12/2013 0

1447 5/6/2014 0

1447 6/7/2014 0

1447 11/20/2014 1

1447 1/10/2015 0

1447 3/15/2015 0

88459 1/6/2014 0

454878 12/8/2014 0

316 2/9/2014 0

316 3/18/2014 0

316 6/2/2014 0

316 9/28/2014 1

316 12/1/2014 0

;

data want;

do _n_=1 by 1 until(last.repid);

set have;

by repid notsorted;

if positive    then n=_n_;

end; 

do _n_=1 by 1 until(last.repid);

set have;

by repid notsorted;

if _n_<=n or n=. then output;

end;

drop n;

run;

Regards,

Naveen Srinivasan

L&T Infotech

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 477 views
  • 6 likes
  • 3 in conversation