- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Good Day,
I'll start with a little background. I have a dataset like the following :
Patient_ID | Admit_Date | Disch_Date | Facility_Type | Flag |
---|---|---|---|---|
12345 | 02JAN2014:00:00:00.000 | 05JAN2014:00:00:00.000 | Nursing Home | |
12345 | 05JAN2014:00:00:00.000 | 07JAN2014:00:00:00.000 | Hospital | |
23456 | 02JAN2014:00:00:00.000 | 06JAN2014:00:00:00.000 | Nursing Home | |
23456 | 06JAN2014:00:00:00.000 | 09JAN2014:00:00:00.000 | Hospital | |
23456 | 08JAN2014:00:00:00.000 | 11JAN2014:00:00:00.000 | Nursing Home | |
23456 | 11JAN2014:00:00:00.000 | 19JAN2014:00:00:00.000 | Hospital | |
23456 | 19JAN2014:00:00:00.000 | 23JAN2014:00:00:00.000 | Nursing Home | |
23456 | 24JAN2014:00:00:00.000 | 25JAN2014:00:00:00.000 | Hospital | |
23456 | 25JAN2014:00:00:00.000 | 30JAN2014:00:00:00.000 | Nursing Home | |
34567 | 09JAN2014:00:00:00.000 | 15JAN2014:00:00:00.000 | Nursing Home | |
45678 | 13JAN2014:00:00:00.000 | 20JAN2014:00:00:00.000 | Nursing Home | |
etc.... |
I would like to loop through this dataset and update the Flag column with a 1 depending on information in the next row. The logic I need to use is something like the following :
I = 1
do while I <= "num rows in dataset"
If Patient_ID[I+1] = Patient_ID and
Facility_Type[I+1] = "Hospital" and
Facility_Type = "Nursing Home" and
datepart(Admit_Date[I + 1]) >= datepart(Admit_Date) and
datepart(Admit_Date[I + 1]) <= datepart(Disch_Date)
Then
Flag = 1
Else Flag = 0
I = I+1
Next
This would update the dataset to be the following :
Patient_ID | Admit_Date | Disch_Date | Facility_Type | Flag |
---|---|---|---|---|
12345 | 02JAN2014:00:00:00.000 | 05JAN2014:00:00:00.000 | Nursing Home | 1 |
12345 | 05JAN2014:00:00:00.000 | 07JAN2014:00:00:00.000 | Hospital | 0 |
23456 | 02JAN2014:00:00:00.000 | 06JAN2014:00:00:00.000 | Nursing Home | 1 |
23456 | 06JAN2014:00:00:00.000 | 09JAN2014:00:00:00.000 | Hospital | 0 |
23456 | 09JAN2014:00:00:00.000 | 11JAN2014:00:00:00.000 | Nursing Home | 1 |
23456 | 10JAN2014:00:00:00.000 | 19JAN2014:00:00:00.000 | Hospital | 0 |
23456 | 19JAN2014:00:00:00.000 | 23JAN2014:00:00:00.000 | Nursing Home | 0 |
23456 | 24JAN2014:00:00:00.000 | 25JAN2014:00:00:00.000 | Hospital | 0 |
23456 | 25JAN2014:00:00:00.000 | 30JAN2014:00:00:00.000 | Nursing Home | 0 |
34567 | 09JAN2014:00:00:00.000 | 15JAN2014:00:00:00.000 | Nursing Home | 0 |
45678 | 13JAN2014:00:00:00.000 | 20JAN2014:00:00:00.000 | Nursing Home | 0 |
etc.... |
I have been going nuts trying to figure out how to code this in SAS. There's tons of documentation online that sort of applies, but nothing specific enough and try as I might, I can't seem to work it out. I've been tempted to export the data to a file, parse it using VBS and then reimport, but that is horribly backwards and inefficient. Plus, I won't learn anything about SAS.
Any help would be greatly appreciated.
Thanks,
John
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
SAS has the LAG function to look backwards as LinusH pointed out. However there is no looking forward function. But you can kind of simulate this using a second SET statement with the POINT= option so that you will read the I+1 observation, you will need to rename the variables from the second SET statement, so that you can compare. Taking your logic here is some sample code.
infile cards dlm=",";
input
Patient_ID : 8.
Admit_Date : anydtdtm.
Disch_Date : anydtdtm.
Facility_Type : $32.
;
format admit_date disch_date datetime19.;
cards;
12345,02JAN2014:00:00:00.000,05JAN2014:00:00:00.000,Nursing Home
12345,05JAN2014:00:00:00.000,07JAN2014:00:00:00.000,Hospital
23456,02JAN2014:00:00:00.000,06JAN2014:00:00:00.000,Nursing Home
23456,06JAN2014:00:00:00.000,09JAN2014:00:00:00.000,Hospital
23456,08JAN2014:00:00:00.000,11JAN2014:00:00:00.000,Nursing Home
23456,11JAN2014:00:00:00.000,19JAN2014:00:00:00.000,Hospital
23456,19JAN2014:00:00:00.000,23JAN2014:00:00:00.000,Nursing Home
23456,24JAN2014:00:00:00.000,25JAN2014:00:00:00.000,Hospital
23456,25JAN2014:00:00:00.000,30JAN2014:00:00:00.000,Nursing Home
34567,09JAN2014:00:00:00.000,15JAN2014:00:00:00.000,Nursing Home
45678,13JAN2014:00:00:00.000,20JAN2014:00:00:00.000,Nursing Home
;
data want;
obs1 = 1;
do while( obs1 <= nobs);
set have nobs=nobs;
obs2 = obs1 + 1;
set
have(
rename=(
Patient_ID = pId2
Admit_Date = ad2
Disch_Date = dd2
Facility_Type = ft2
)
) point=obs2
;
If pId2 = Patient_ID and
ft2 = "Hospital" and
Facility_Type = "Nursing Home" and
datepart(ad2) >= datepart(Admit_Date) and
datepart(ad2) <= datepart(Disch_Date) Then
Flag = 1;
Else Flag = 0;
output;
obs1 + 1;
end;
drop obs1 obs2 pId2 ad2 dd2 ft2;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If you resort your data descending, you could use retain statement (or lag() function) to keep the previous rows values in the current row. That would replace the "i+1" in your code.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
SAS has the LAG function to look backwards as LinusH pointed out. However there is no looking forward function. But you can kind of simulate this using a second SET statement with the POINT= option so that you will read the I+1 observation, you will need to rename the variables from the second SET statement, so that you can compare. Taking your logic here is some sample code.
infile cards dlm=",";
input
Patient_ID : 8.
Admit_Date : anydtdtm.
Disch_Date : anydtdtm.
Facility_Type : $32.
;
format admit_date disch_date datetime19.;
cards;
12345,02JAN2014:00:00:00.000,05JAN2014:00:00:00.000,Nursing Home
12345,05JAN2014:00:00:00.000,07JAN2014:00:00:00.000,Hospital
23456,02JAN2014:00:00:00.000,06JAN2014:00:00:00.000,Nursing Home
23456,06JAN2014:00:00:00.000,09JAN2014:00:00:00.000,Hospital
23456,08JAN2014:00:00:00.000,11JAN2014:00:00:00.000,Nursing Home
23456,11JAN2014:00:00:00.000,19JAN2014:00:00:00.000,Hospital
23456,19JAN2014:00:00:00.000,23JAN2014:00:00:00.000,Nursing Home
23456,24JAN2014:00:00:00.000,25JAN2014:00:00:00.000,Hospital
23456,25JAN2014:00:00:00.000,30JAN2014:00:00:00.000,Nursing Home
34567,09JAN2014:00:00:00.000,15JAN2014:00:00:00.000,Nursing Home
45678,13JAN2014:00:00:00.000,20JAN2014:00:00:00.000,Nursing Home
;
data want;
obs1 = 1;
do while( obs1 <= nobs);
set have nobs=nobs;
obs2 = obs1 + 1;
set
have(
rename=(
Patient_ID = pId2
Admit_Date = ad2
Disch_Date = dd2
Facility_Type = ft2
)
) point=obs2
;
If pId2 = Patient_ID and
ft2 = "Hospital" and
Facility_Type = "Nursing Home" and
datepart(ad2) >= datepart(Admit_Date) and
datepart(ad2) <= datepart(Disch_Date) Then
Flag = 1;
Else Flag = 0;
output;
obs1 + 1;
end;
drop obs1 obs2 pId2 ad2 dd2 ft2;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
WOW!! This worked perfectly!! Thank you so much! I never would have arrived there without your help.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Another way to do this is with a MERGE statement without a BY statement and FIRSTOBS options, like:
/* using same data member HAVE from above */
data want;
MERGE have (firstobs=1)
have (firstobs=2
rename=(Patient_ID = pId2
Admit_Date=ad2
Disch_Date=dd2
Facility_Type=ft2)
);
if pId2 = Patient_ID
and ft2 = "Hospital"
and Facility_Type = "Nursing Home"
and datepart(ad2) >= datepart(Admit_Date)
and datepart(ad2) <= datepart(Disch_Date) then Flag = 1;
else Flag = 0;
drop pid2 ad2 dd2 ft2;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Good thought!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Starting from the Have dataset from the previous answers, this gives the same result:
proc sql;
create table want as
select a.*, case when a.Facility_type='Nursing Home' and b.facility_type='Hospital' and
b.admit_date le a.disch_date then 1
else 0
end as flag
from have a
left join have b
on a.patient_id=b.patient_id
and b.admit_date gt a.admit_date
group by a.patient_id, a.admit_date
having b.admit_date=min(b.admit_date);
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Super helpful tip! My coworker and I do some crazy lag stuff when we need to do something like this. So happy to have an easier alternative.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Shout out to who shared a macro version of this approach on his blog: LagLead Macro.
Chris