Hello - I have a dataset with date admit and date discharge. There are cases where the patient may have been admitted again without ever discharging. In order to figure out the number of patients returning, I need to flag those that never actually discharged. So for example, since this patient never discharged and was re-admitted on 1/23, I want observation 2 and 3 to be flagged as a patient who never discharged. The issue is each group (Patient ID) has a different number of observations. I couldn't figure out how to use lag to look back n times. It seems to need a specific number. I also can't just use first.ID or last.ID because the missing date may occur in the middle of the group. Hope this makes sense!
| Patient | Date___Admit | Date___Discharge | n |
| 1 | 3/26/2019 | . | 1 |
| 1 | 1/23/2020 | 6/19/2020 | 2 |
| 1 | 8/21/2020 | 7/19/2021 | 3 |
Join with itself, with a filter on the second dataset:
data want;
merge
have
have (
in=t2
keep=patient date_discharge
rename=(date_discharge=disc)
where=(disc = .)
)
;
if t2 then non_disc = 1;
drop disc;
run;
Join with itself, with a filter on the second dataset:
data want;
merge
have
have (
in=t2
keep=patient date_discharge
rename=(date_discharge=disc)
where=(disc = .)
)
;
if t2 then non_disc = 1;
drop disc;
run;
Thank you! This worked great! Only thing is, I left something out. I have to define: new patients and returning patients. Those that you helped me with are neither new or returning. They have an admission with no discharge, therefore they're still receiving services in some program. The problem I'm running into is, if I use the merge technique, it overrides some of my defined returning patients. So for example, Patient 1 is a return patient when they're admitted on 7/1/2021, but it's flagged as a neither. Your technique helped me with Patient 2, as using lag does not work for that one (note: the patient was admitted on 7/10/20 - prior to their discharge date on 11/10/20 - that's why this patient is considered neither a new or retuning patient). Do you know of a way to fix this issue?
| Date___Admit | Date___Discharge | ID___Patient | n | WANT |
| 9/27/2019 | 1/10/2020 | 1 | 1 | First_Time |
| 1/28/2021 | 4/27/2021 | 1 | 2 | Return |
| 7/1/2021 | . | 1 | 3 | Return |
| 7/13/2021 | . | 1 | 4 | . |
| 7/2/2020 | 11/10/2020 | 2 | 1 | First_Time |
| 7/13/2020 | 4/6/2021 | 2 | 2 | . |
| 10/22/2020 | . | 2 | 3 | . |
| 11/10/2020 | 3/31/2021 | 2 | 4 | . |
| 5/26/2021 | . | 2 | 5 | . |
IMO, your last observation of patient 2 should be flagged "Return".
This code does it:
data have;
infile datalines dlm="09"x dsd truncover;
input
date_admit :mmddyy10.
date_discharge :mmddyy10.
id_patient $
n
;
format
date_admit
date_discharge yymmdd10.
;
datalines;
9/27/2019 1/10/2020 1 1
1/28/2021 4/27/2021 1 2
7/1/2021 . 1 3
7/13/2021 . 1 4 .
7/2/2020 11/10/2020 2 1
7/13/2020 4/6/2021 2 2
10/22/2020 . 2 3
11/10/2020 3/31/2021 2 4
5/26/2021 . 2 5
;
data want;
set have;
by id_patient;
retain l_disc;
if first.id_patient
then do;
want = "First time";
l_disc = .;
end;
else do;
if l_disc ne . and date_admit > l_disc then want = "Return";
end;
if date_discharge ne . then l_disc = max(l_disc,date_discharge);
if coalesce(date_discharge,lag(date_discharge)) = . then want = "";
drop l_disc;
run;
Note how example data is presented in readily usable form as a data step with datalines; please do so yourself in the future.
Nearly 200 sessions are now available on demand in the Innovate Hub.
Watch Now →SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.