Good Day,
I'll start with a little background. I have a dataset like the following :
Patient_ID | Admit_Date | Disch_Date | Facility_Type | Flag |
---|---|---|---|---|
12345 | 02JAN2014:00:00:00.000 | 05JAN2014:00:00:00.000 | Nursing Home | |
12345 | 05JAN2014:00:00:00.000 | 07JAN2014:00:00:00.000 | Hospital | |
23456 | 02JAN2014:00:00:00.000 | 06JAN2014:00:00:00.000 | Nursing Home | |
23456 | 06JAN2014:00:00:00.000 | 09JAN2014:00:00:00.000 | Hospital | |
23456 | 08JAN2014:00:00:00.000 | 11JAN2014:00:00:00.000 | Nursing Home | |
23456 | 11JAN2014:00:00:00.000 | 19JAN2014:00:00:00.000 | Hospital | |
23456 | 19JAN2014:00:00:00.000 | 23JAN2014:00:00:00.000 | Nursing Home | |
23456 | 24JAN2014:00:00:00.000 | 25JAN2014:00:00:00.000 | Hospital | |
23456 | 25JAN2014:00:00:00.000 | 30JAN2014:00:00:00.000 | Nursing Home | |
34567 | 09JAN2014:00:00:00.000 | 15JAN2014:00:00:00.000 | Nursing Home | |
45678 | 13JAN2014:00:00:00.000 | 20JAN2014:00:00:00.000 | Nursing Home | |
etc.... |
I would like to loop through this dataset and update the Flag column with a 1 depending on information in the next row. The logic I need to use is something like the following :
I = 1
do while I <= "num rows in dataset"
If Patient_ID[I+1] = Patient_ID and
Facility_Type[I+1] = "Hospital" and
Facility_Type = "Nursing Home" and
datepart(Admit_Date[I + 1]) >= datepart(Admit_Date) and
datepart(Admit_Date[I + 1]) <= datepart(Disch_Date)
Then
Flag = 1
Else Flag = 0
I = I+1
Next
This would update the dataset to be the following :
Patient_ID | Admit_Date | Disch_Date | Facility_Type | Flag |
---|---|---|---|---|
12345 | 02JAN2014:00:00:00.000 | 05JAN2014:00:00:00.000 | Nursing Home | 1 |
12345 | 05JAN2014:00:00:00.000 | 07JAN2014:00:00:00.000 | Hospital | 0 |
23456 | 02JAN2014:00:00:00.000 | 06JAN2014:00:00:00.000 | Nursing Home | 1 |
23456 | 06JAN2014:00:00:00.000 | 09JAN2014:00:00:00.000 | Hospital | 0 |
23456 | 09JAN2014:00:00:00.000 | 11JAN2014:00:00:00.000 | Nursing Home | 1 |
23456 | 10JAN2014:00:00:00.000 | 19JAN2014:00:00:00.000 | Hospital | 0 |
23456 | 19JAN2014:00:00:00.000 | 23JAN2014:00:00:00.000 | Nursing Home | 0 |
23456 | 24JAN2014:00:00:00.000 | 25JAN2014:00:00:00.000 | Hospital | 0 |
23456 | 25JAN2014:00:00:00.000 | 30JAN2014:00:00:00.000 | Nursing Home | 0 |
34567 | 09JAN2014:00:00:00.000 | 15JAN2014:00:00:00.000 | Nursing Home | 0 |
45678 | 13JAN2014:00:00:00.000 | 20JAN2014:00:00:00.000 | Nursing Home | 0 |
etc.... |
I have been going nuts trying to figure out how to code this in SAS. There's tons of documentation online that sort of applies, but nothing specific enough and try as I might, I can't seem to work it out. I've been tempted to export the data to a file, parse it using VBS and then reimport, but that is horribly backwards and inefficient. Plus, I won't learn anything about SAS.
Any help would be greatly appreciated.
Thanks,
John
SAS has the LAG function to look backwards as LinusH pointed out. However there is no looking forward function. But you can kind of simulate this using a second SET statement with the POINT= option so that you will read the I+1 observation, you will need to rename the variables from the second SET statement, so that you can compare. Taking your logic here is some sample code.
If you resort your data descending, you could use retain statement (or lag() function) to keep the previous rows values in the current row. That would replace the "i+1" in your code.
SAS has the LAG function to look backwards as LinusH pointed out. However there is no looking forward function. But you can kind of simulate this using a second SET statement with the POINT= option so that you will read the I+1 observation, you will need to rename the variables from the second SET statement, so that you can compare. Taking your logic here is some sample code.
WOW!! This worked perfectly!! Thank you so much! I never would have arrived there without your help.
Another way to do this is with a MERGE statement without a BY statement and FIRSTOBS options, like:
/* using same data member HAVE from above */
data want;
MERGE have (firstobs=1)
have (firstobs=2
rename=(Patient_ID = pId2
Admit_Date=ad2
Disch_Date=dd2
Facility_Type=ft2)
);
if pId2 = Patient_ID
and ft2 = "Hospital"
and Facility_Type = "Nursing Home"
and datepart(ad2) >= datepart(Admit_Date)
and datepart(ad2) <= datepart(Disch_Date) then Flag = 1;
else Flag = 0;
drop pid2 ad2 dd2 ft2;
run;
Good thought!
Starting from the Have dataset from the previous answers, this gives the same result:
proc sql;
create table want as
select a.*, case when a.Facility_type='Nursing Home' and b.facility_type='Hospital' and
b.admit_date le a.disch_date then 1
else 0
end as flag
from have a
left join have b
on a.patient_id=b.patient_id
and b.admit_date gt a.admit_date
group by a.patient_id, a.admit_date
having b.admit_date=min(b.admit_date);
quit;
Super helpful tip! My coworker and I do some crazy lag stuff when we need to do something like this. So happy to have an easier alternative.
Shout out to who shared a macro version of this approach on his blog: LagLead Macro.
Chris
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.