Hi
I wanted to identify a long-term stay through discharge status given patient ID and service begin and end dates. As the example data shows below, when discharge status is 3 or missing, it means one claim/record is a part of a stay. For example, the first two records for pat_ID is actually one stay as discharge_status is 3 in the first record. When discharge status is not 3 or missing, it means one patient was discharged somewhere else.
Any suggestion is greatly appreciated!
pat_ID | service_beg_dt | service_end_dt | discharge_status |
1 | 1/1/2018 | 3/31/2018 | 3 |
1 | 1/1/2018 | 4/30/2018 | 1 |
1 | 5/16/2018 | 5/31/2018 | 3 |
1 | 5/16/2018 | 6/30/2018 | 3 |
1 | 5/16/2018 | 7/31/2018 | 3 |
1 | 5/16/2018 | 8/31/2018 | 3 |
1 | 5/16/2018 | 9/30/2018 | 3 |
1 | 5/16/2018 | 10/31/2018 | 3 |
1 | 5/16/2018 | 11/30/2018 | 3 |
1 | 5/16/2018 | 12/31/2018 | 3 |
2 | 11/1/2017 | 1/31/2018 | 1 |
2 | 2/12/2018 | 5/1/2018 | 3 |
2 | 5/1/2018 | 5/31/2018 | 2 |
Given we don't have sample data in the form of a working data step, you'll need to test the code below:
data want (drop=service_beg_dt rename=(service_end_dt=stay_end_dt) );
do until (discharge_status^=3 or last.pat_id=1);
set have;
by pat_id;
stay_beg_dt=min(stay_beg_dt,service_beg_dt);
end;
length_of_stay=service_end_dt-stay_beg_dt;
format stay_beg_dt mmddyy10. ;
run;
This assumes that the only continuation indicator is DISCHARGE_STATUS=3, and that the data are sorted by PAT_ID/SERVICE_END_DT.
Can you please show us the desired output from this data?
I have to admit, I don't think the explanation is very clear of what a long-term stay is, can you explain in more detail?
Thanks for the response!
The desired output would be:
pat_ID | service_beg_dt | service_end_dt | discharge_status |
1 | 1/1/2018 | 4/30/2018 | 1 |
1 | 5/16/2018 | 12/31/2018 | 3 |
2 | 11/1/2017 | 1/31/2018 | 1 |
2 | 2/12/2018 | 5/31/2018 | 2 |
where the begin and end date of a stay are captured, and the final discharge status is also recorded. The time frame is Jan. 1, 2018 to Dec. 31, 2018, which is why for pat_ID=1 had the discharge status=3 in the second stay.
I am sure that i didn't fully understand the logic, but this step seems to produce the expected result:
data want;
set have;
if discharge_status ^= 3 or service_end_dt = '31Dec2018'd;
run;
The tables below may help. I should have used the title "Construct long-term stays". When the discharge status of a record is 3-still patient, it means the stay continues to the next record. When the discharge status is not 3 (1-discharge to home, 2-discharge to inpatient care), it means a stay ends.
For example, the first record for pat_ID=1 has discharge status 3 (still patient), which means the stay continues to the next record till 4/30/2018 when the patient discharged to home (discharge_status=1). The next stay for pat_ID=1 went from 5/16/2018 to 12/31/2018 as the rest of records all have discharge_status=3.
For pat_ID=2, the record with service dates 1/1/2018-1/31/2018 has discharge_status=1, so the single record is a complete stay. The second and third records are for one stay because the second record has discharge_status=3 (still patient).
Assuming that the study timeframe is 1/1/2018 - 12/31/2018.
Thank you!!
Original data
pat_ID | service_beg_dt | service_end_dt | discharge_status | Note |
1 | 1/1/2018 | 3/31/2018 | 3 | 3 - still patient |
1 | 1/1/2018 | 4/30/2018 | 1 | 1 - discharge to home |
1 | 5/16/2018 | 5/31/2018 | 3 | |
1 | 5/16/2018 | 6/30/2018 | 3 | |
1 | 5/16/2018 | 7/31/2018 | 3 | |
1 | 5/16/2018 | 8/31/2018 | 3 | |
1 | 5/16/2018 | 9/30/2018 | 3 | |
1 | 5/16/2018 | 10/31/2018 | 3 | |
1 | 5/16/2018 | 11/30/2018 | 3 | |
1 | 5/16/2018 | 12/31/2018 | 3 | end of study timeframe, still patient |
2 | 1/1/2018 | 1/31/2018 | 1 | 1 - discharge to home |
2 | 2/12/2018 | 5/1/2018 | 3 | 3 - still patient |
2 | 5/1/2018 | 5/31/2018 | 2 | 2 - discharge to inpatient care |
Desired output:
pat_ID | stay_beg_dt | stay_end_dt | discharge_status | Length_of_stay |
1 | 1/1/2018 | 4/30/2018 | 1 | 119 |
1 | 5/16/2018 | 12/31/2018 | 3 | 229 |
2 | 11/1/2017 | 1/31/2018 | 1 | 91 |
2 | 2/12/2018 | 5/31/2018 | 2 | 108 |
I hope the side-by-side tables of the original data and the desired output can make my intent a bit more clear. I should have used title "Construct long-term stays". In the desired output, the begin and end dates of a stay are identified from the original data.
For example, for pat_ID=1, the first record (1/1/2018 - 3/31/2018) has discharge status=3 (still patient), so the stay continues to the next record and the end date of the stay is 4/30/2018 as the patient is discharged to home.
The rest of records for pat_ID=1 go from 5/16/2018 to 12/31/2018 with the same discharge status 3-still patient, so the patient remains in the same stay during the period.
The first record for pat_ID=2 has discharge status 1 - discharged to home, so the single record is a complete stay. The rest of records are for another stay as the record with service dates 2/12/2018-5/1/2018 has discharge status 3-still patient, and it continued to the next record with service end date of 5/31/2018.
Length of stay is the number of days from stay begin date to stay end date.
Thank you very much!
Original Data Desired Output
pat_ID | service_beg_dt | service_end_dt | discharge_status | Note | pat_ID | stay_beg_dt | stay_end_dt | discharge_status | Length_of_stay |
1 | 1/1/2018 | 3/31/2018 | 3 | 3 - still patient | 1 | 1/1/2018 | 4/30/2018 | 1 | 119 |
1 | 1/1/2018 | 4/30/2018 | 1 | 1 - discharge to home | 1 | 5/16/2018 | 12/31/2018 | 3 | 229 |
1 | 5/16/2018 | 5/31/2018 | 3 | 2 | 11/1/2017 | 1/31/2018 | 1 | 91 | |
1 | 5/16/2018 | 6/30/2018 | 3 | 2 | 2/12/2018 | 5/31/2018 | 2 | 108 | |
1 | 5/16/2018 | 7/31/2018 | 3 | ||||||
1 | 5/16/2018 | 8/31/2018 | 3 | ||||||
1 | 5/16/2018 | 9/30/2018 | 3 | ||||||
1 | 5/16/2018 | 10/31/2018 | 3 | ||||||
1 | 5/16/2018 | 11/30/2018 | 3 | ||||||
1 | 5/16/2018 | 12/31/2018 | 3 | end of study timeframe, still patient | |||||
2 | 11/1/2017 | 1/31/2018 | 1 | 1 - discharge to home | |||||
2 | 2/12/2018 | 5/1/2018 | 3 | 3 - still patient | |||||
2 | 5/1/2018 | 5/31/2018 | 2 | 2 - discharge to inpatient care |
And now please post the original data as working data step using datalines or cards.
data have;
infile datalines truncover dsd;
input pat_ID (service_beg_dt service_end_dt) (:mmddyy10.) discharge_status;
format service_beg_dt service_end_dt mmddyy10.;
datalines;
1,1/1/2018,3/31/2018,3
1,1/1/2018,4/30/2018,1
1,5/16/2018,5/31/2018,3
1,5/16/2018,6/30/2018,3
1,5/16/2018,7/31/2018,3
1,5/16/2018,8/31/2018,3
1,5/16/2018,9/30/2018,3
1,5/16/2018,10/31/2018,3
1,5/16/2018,11/30/2018,3
1,5/16/2018,12/31/2018,3
2,1/1/2018,1/31/2018,1
2,2/12/2018,5/1/2018,3
2,5/1/2018,5/31/2018,2
;
Given we don't have sample data in the form of a working data step, you'll need to test the code below:
data want (drop=service_beg_dt rename=(service_end_dt=stay_end_dt) );
do until (discharge_status^=3 or last.pat_id=1);
set have;
by pat_id;
stay_beg_dt=min(stay_beg_dt,service_beg_dt);
end;
length_of_stay=service_end_dt-stay_beg_dt;
format stay_beg_dt mmddyy10. ;
run;
This assumes that the only continuation indicator is DISCHARGE_STATUS=3, and that the data are sorted by PAT_ID/SERVICE_END_DT.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.