Hi
I wanted to identify a long-term stay through discharge status given patient ID and service begin and end dates. As the example data shows below, when discharge status is 3 or missing, it means one claim/record is a part of a stay. For example, the first two records for pat_ID is actually one stay as discharge_status is 3 in the first record. When discharge status is not 3 or missing, it means one patient was discharged somewhere else.
Any suggestion is greatly appreciated!
pat_ID | service_beg_dt | service_end_dt | discharge_status |
1 | 1/1/2018 | 3/31/2018 | 3 |
1 | 1/1/2018 | 4/30/2018 | 1 |
1 | 5/16/2018 | 5/31/2018 | 3 |
1 | 5/16/2018 | 6/30/2018 | 3 |
1 | 5/16/2018 | 7/31/2018 | 3 |
1 | 5/16/2018 | 8/31/2018 | 3 |
1 | 5/16/2018 | 9/30/2018 | 3 |
1 | 5/16/2018 | 10/31/2018 | 3 |
1 | 5/16/2018 | 11/30/2018 | 3 |
1 | 5/16/2018 | 12/31/2018 | 3 |
2 | 11/1/2017 | 1/31/2018 | 1 |
2 | 2/12/2018 | 5/1/2018 | 3 |
2 | 5/1/2018 | 5/31/2018 | 2 |
Given we don't have sample data in the form of a working data step, you'll need to test the code below:
data want (drop=service_beg_dt rename=(service_end_dt=stay_end_dt) );
do until (discharge_status^=3 or last.pat_id=1);
set have;
by pat_id;
stay_beg_dt=min(stay_beg_dt,service_beg_dt);
end;
length_of_stay=service_end_dt-stay_beg_dt;
format stay_beg_dt mmddyy10. ;
run;
This assumes that the only continuation indicator is DISCHARGE_STATUS=3, and that the data are sorted by PAT_ID/SERVICE_END_DT.
Can you please show us the desired output from this data?
I have to admit, I don't think the explanation is very clear of what a long-term stay is, can you explain in more detail?
Thanks for the response!
The desired output would be:
pat_ID | service_beg_dt | service_end_dt | discharge_status |
1 | 1/1/2018 | 4/30/2018 | 1 |
1 | 5/16/2018 | 12/31/2018 | 3 |
2 | 11/1/2017 | 1/31/2018 | 1 |
2 | 2/12/2018 | 5/31/2018 | 2 |
where the begin and end date of a stay are captured, and the final discharge status is also recorded. The time frame is Jan. 1, 2018 to Dec. 31, 2018, which is why for pat_ID=1 had the discharge status=3 in the second stay.
I am sure that i didn't fully understand the logic, but this step seems to produce the expected result:
data want;
set have;
if discharge_status ^= 3 or service_end_dt = '31Dec2018'd;
run;
The tables below may help. I should have used the title "Construct long-term stays". When the discharge status of a record is 3-still patient, it means the stay continues to the next record. When the discharge status is not 3 (1-discharge to home, 2-discharge to inpatient care), it means a stay ends.
For example, the first record for pat_ID=1 has discharge status 3 (still patient), which means the stay continues to the next record till 4/30/2018 when the patient discharged to home (discharge_status=1). The next stay for pat_ID=1 went from 5/16/2018 to 12/31/2018 as the rest of records all have discharge_status=3.
For pat_ID=2, the record with service dates 1/1/2018-1/31/2018 has discharge_status=1, so the single record is a complete stay. The second and third records are for one stay because the second record has discharge_status=3 (still patient).
Assuming that the study timeframe is 1/1/2018 - 12/31/2018.
Thank you!!
Original data
pat_ID | service_beg_dt | service_end_dt | discharge_status | Note |
1 | 1/1/2018 | 3/31/2018 | 3 | 3 - still patient |
1 | 1/1/2018 | 4/30/2018 | 1 | 1 - discharge to home |
1 | 5/16/2018 | 5/31/2018 | 3 | |
1 | 5/16/2018 | 6/30/2018 | 3 | |
1 | 5/16/2018 | 7/31/2018 | 3 | |
1 | 5/16/2018 | 8/31/2018 | 3 | |
1 | 5/16/2018 | 9/30/2018 | 3 | |
1 | 5/16/2018 | 10/31/2018 | 3 | |
1 | 5/16/2018 | 11/30/2018 | 3 | |
1 | 5/16/2018 | 12/31/2018 | 3 | end of study timeframe, still patient |
2 | 1/1/2018 | 1/31/2018 | 1 | 1 - discharge to home |
2 | 2/12/2018 | 5/1/2018 | 3 | 3 - still patient |
2 | 5/1/2018 | 5/31/2018 | 2 | 2 - discharge to inpatient care |
Desired output:
pat_ID | stay_beg_dt | stay_end_dt | discharge_status | Length_of_stay |
1 | 1/1/2018 | 4/30/2018 | 1 | 119 |
1 | 5/16/2018 | 12/31/2018 | 3 | 229 |
2 | 11/1/2017 | 1/31/2018 | 1 | 91 |
2 | 2/12/2018 | 5/31/2018 | 2 | 108 |
I hope the side-by-side tables of the original data and the desired output can make my intent a bit more clear. I should have used title "Construct long-term stays". In the desired output, the begin and end dates of a stay are identified from the original data.
For example, for pat_ID=1, the first record (1/1/2018 - 3/31/2018) has discharge status=3 (still patient), so the stay continues to the next record and the end date of the stay is 4/30/2018 as the patient is discharged to home.
The rest of records for pat_ID=1 go from 5/16/2018 to 12/31/2018 with the same discharge status 3-still patient, so the patient remains in the same stay during the period.
The first record for pat_ID=2 has discharge status 1 - discharged to home, so the single record is a complete stay. The rest of records are for another stay as the record with service dates 2/12/2018-5/1/2018 has discharge status 3-still patient, and it continued to the next record with service end date of 5/31/2018.
Length of stay is the number of days from stay begin date to stay end date.
Thank you very much!
Original Data Desired Output
pat_ID | service_beg_dt | service_end_dt | discharge_status | Note | pat_ID | stay_beg_dt | stay_end_dt | discharge_status | Length_of_stay |
1 | 1/1/2018 | 3/31/2018 | 3 | 3 - still patient | 1 | 1/1/2018 | 4/30/2018 | 1 | 119 |
1 | 1/1/2018 | 4/30/2018 | 1 | 1 - discharge to home | 1 | 5/16/2018 | 12/31/2018 | 3 | 229 |
1 | 5/16/2018 | 5/31/2018 | 3 | 2 | 11/1/2017 | 1/31/2018 | 1 | 91 | |
1 | 5/16/2018 | 6/30/2018 | 3 | 2 | 2/12/2018 | 5/31/2018 | 2 | 108 | |
1 | 5/16/2018 | 7/31/2018 | 3 | ||||||
1 | 5/16/2018 | 8/31/2018 | 3 | ||||||
1 | 5/16/2018 | 9/30/2018 | 3 | ||||||
1 | 5/16/2018 | 10/31/2018 | 3 | ||||||
1 | 5/16/2018 | 11/30/2018 | 3 | ||||||
1 | 5/16/2018 | 12/31/2018 | 3 | end of study timeframe, still patient | |||||
2 | 11/1/2017 | 1/31/2018 | 1 | 1 - discharge to home | |||||
2 | 2/12/2018 | 5/1/2018 | 3 | 3 - still patient | |||||
2 | 5/1/2018 | 5/31/2018 | 2 | 2 - discharge to inpatient care |
And now please post the original data as working data step using datalines or cards.
data have;
infile datalines truncover dsd;
input pat_ID (service_beg_dt service_end_dt) (:mmddyy10.) discharge_status;
format service_beg_dt service_end_dt mmddyy10.;
datalines;
1,1/1/2018,3/31/2018,3
1,1/1/2018,4/30/2018,1
1,5/16/2018,5/31/2018,3
1,5/16/2018,6/30/2018,3
1,5/16/2018,7/31/2018,3
1,5/16/2018,8/31/2018,3
1,5/16/2018,9/30/2018,3
1,5/16/2018,10/31/2018,3
1,5/16/2018,11/30/2018,3
1,5/16/2018,12/31/2018,3
2,1/1/2018,1/31/2018,1
2,2/12/2018,5/1/2018,3
2,5/1/2018,5/31/2018,2
;
Given we don't have sample data in the form of a working data step, you'll need to test the code below:
data want (drop=service_beg_dt rename=(service_end_dt=stay_end_dt) );
do until (discharge_status^=3 or last.pat_id=1);
set have;
by pat_id;
stay_beg_dt=min(stay_beg_dt,service_beg_dt);
end;
length_of_stay=service_end_dt-stay_beg_dt;
format stay_beg_dt mmddyy10. ;
run;
This assumes that the only continuation indicator is DISCHARGE_STATUS=3, and that the data are sorted by PAT_ID/SERVICE_END_DT.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.