This is my code:
data MS_AL.pers; set MS_AL.PERSISTENCE; by enrolid svcdate; if first.enrolid then do; svcdate_new=indexdt; retain svcdate_new daysupp; end; else if svcdate_new+daysupp > svcdate then do; svcdate_new = svcdate_new+daysupp; end;
else do; svcdate_new = svcdate; end; format svcdate_new mmddyy10.; run;
This gives the output like this:
Essentially what I'm trying to do is to create a new column with shifted dates - svcdate_new. The condition for shifting is that:
1. First row is always indexdt
2. For all subsequent rows, if svcdate (for example in the 2nd row it is 8/22/16) < previous svcdate_new (in this case 7/25/16) + daysupp from previous row (30 in this case), then svcdate_new (for second row) = svcdate_new from previous row (7/25/16) + daysupp from previous row (30 in this case) which it computes correctly as 8/25/16.
However, for the 3rd row, svcdate_new should be 8/24/16 +30 = 9/23/2016. However, it is incorrectly computing as svcdate.
Can someone help?
You can take advantage of the fact that variables read by a SET statement are automatically retained until that SET statement is encountered again. Usually a set statement executes once per incoming record. But below there are two sets. The second is executed the same number of times as the first, but it executes in batches - once per group. As a result the variables svcdate_new and _svdaysupp are retained until over the course of each ENROLID group:
data want (drop=_:);
set have;
by enrolid;
if first.enrolid then do until (first.enrolid);
set have (rename=(indexdt=svcdate_new daysupp=_svdaysupp));
by enrolid ;
end;
else if svcdate_new + _svdaysupp>svcdate then svcdate_new=svcdte_new + _svdaysupp;
else svcdate_new=svcdate;
_svdaysupp=daysupp;
format svcdate_new mmddyy10. ;
run;
I think the interesting feature here is the
if first.enrolid then do until (first.enrolid);
statement. The "if first.enrolid" condition refers to the PRECEDING set ... by statements. But the "do until (first.enrolid)" refers to the FOLLOWING set ... by statements.
Could you please post the sample data to copy and paste, in picture format it is not possible to copy and test.
Posting the output data snapshot:
SVCDATE | DAYSUPP | INDEXDT | ENROLID | SVCDATE_NEW |
7/25/16 | 30 | 7/25/16 | 128682503 | 7/25/16 |
8/22/16 | 30 | 7/25/16 | 128682503 | 8/24/16 |
9/13/16 | 6 | 7/25/16 | 128682503 | 9/13/16 |
9/15/16 | 2 | 7/25/16 | 128682503 | 9/15/16 |
9/20/16 | 30 | 7/25/16 | 128682503 | 10/15/16 |
10/19/16 | 30 | 7/25/16 | 128682503 | 11/14/16 |
11/17/16 | 30 | 7/25/16 | 128682503 | 12/14/16 |
Try next not tested code.
To add previous row daysupp to need to save it (remain) otherwise
the current row overrides it.
data MS_AL.pers;
set MS_AL.PERSISTENCE;
by enrolid svcdate;
retain svcdate_new svdaysupp;
if first.enrolid then do;
svcdate_new=indexdt;
svdaysupp = daysupp;
end;
else if svcdate_new+svdaysupp > svcdate then do;
svcdate_new = svcdate_new+svdaysupp;
end;
else do; svcdate_new = svcdate; end;
svdaysupp = daysupp;
drop svdaysupp; /* ? */
format svcdate_new mmddyy10.;
run;
@alaxman: Posting data in usable form, not in screenshots or tables helps us to provide working solutions.
The retain-statement is non-conditional, meaning that it is always executed, even if it is part of an if- or else-block. The retain-statement marks variables, so that in during the automatic iterations values are not reset to missing, but this does not affect variables loaded from a dataset or read with input-statement. So retaining daysup is not possible, you have to create a new variable to be retained (see the solution posted by @Shmuel)
You can take advantage of the fact that variables read by a SET statement are automatically retained until that SET statement is encountered again. Usually a set statement executes once per incoming record. But below there are two sets. The second is executed the same number of times as the first, but it executes in batches - once per group. As a result the variables svcdate_new and _svdaysupp are retained until over the course of each ENROLID group:
data want (drop=_:);
set have;
by enrolid;
if first.enrolid then do until (first.enrolid);
set have (rename=(indexdt=svcdate_new daysupp=_svdaysupp));
by enrolid ;
end;
else if svcdate_new + _svdaysupp>svcdate then svcdate_new=svcdte_new + _svdaysupp;
else svcdate_new=svcdate;
_svdaysupp=daysupp;
format svcdate_new mmddyy10. ;
run;
I think the interesting feature here is the
if first.enrolid then do until (first.enrolid);
statement. The "if first.enrolid" condition refers to the PRECEDING set ... by statements. But the "do until (first.enrolid)" refers to the FOLLOWING set ... by statements.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.