Hi all,
I need to count the number of inpatient stays and calculate length of stay per stay from admission date (admission_dt) and discharge date (discharge_dt). The tricky part is that when a new stay starts days after last stay but with the same provider, it is hard for me to separate the new stay from the previous one. Any suggestion would be greatly appreciated!
Below is the example data, and the note column shows the counting and calculation rules.
PERSON_ID | provider_id | admission_dt | discharge_dt | Note |
1 | A | 8/1/2014 | 1/7/2018 | a stay starts (admission_dt) from previous years, 1/1/2018 will be used as initial admission_dt |
1 | A | 8/1/2014 | 1/10/2018 | Overlapped stay, so it's considered as the same stay as the stay in last claim (last observation) |
1 | A | 1/11/2018 | 1/31/2018 | Consecutive date (date difference<=1) from last observation, so the same inpatient stay |
1 | A | 1/11/2018 | 2/28/2018 | |
1 | A | 1/11/2018 | 2/28/2018 | |
1 | A | 1/11/2018 | 3/31/2018 | |
1 | A | 1/11/2018 | 4/30/2018 | |
1 | A | 5/16/2018 | 5/31/2018 | Non-consecutive date from last admission, so a separate inpatient stay |
1 | A | 5/16/2018 | 6/30/2018 | |
1 | A | 5/16/2018 | 7/31/2018 | |
1 | A | 5/16/2018 | 8/31/2018 | |
1 | A | 5/16/2018 | 9/30/2018 | |
1 | A | 5/16/2018 | 10/31/2018 | |
1 | A | 5/16/2018 | 11/30/2018 | |
1 | A | 5/16/2018 | 12/31/2018 | |
2 | B | 11/1/2017 | 1/31/2018 | |
2 | B | 12/24/2017 | 1/1/2018 | |
2 | B | 11/1/2017 | 2/28/2018 | |
2 | B | 11/1/2017 | 3/31/2018 | |
2 | B | 11/1/2017 | 4/30/2018 | |
2 | B | 11/1/2017 | 5/31/2018 | |
2 | B | 11/1/2017 | 6/30/2018 | |
2 | B | 11/1/2017 | 7/31/2018 | |
2 | B | 11/1/2017 | 8/31/2018 | |
2 | B | 11/1/2017 | 9/30/2018 | |
2 | B | 11/1/2017 | 10/5/2018 | |
2 | C | 10/6/2018 | 10/31/2018 | Consecutive date from last admission but with different provider, so a separate inpatient stay |
2 | C | 10/5/2018 | 11/30/2018 | |
2 | C | 10/5/2018 | 12/31/2018 | |
3 | D | 1/27/2018 | 1/30/2018 | |
3 | D | 1/27/2018 | 2/28/2018 | |
3 | E | 3/15/2018 | A new inpatient stay at a different provider 15 days after the last stay | |
3 | E | 3/15/2018 | Missing discharge date means the person is not discharged per claim, so 12/31/2018 will be used as the discharge date for the stay | |
3 | E | 3/15/2018 | ||
3 | E | 3/15/2018 | ||
3 | E | 3/15/2018 | ||
3 | E | 3/15/2018 | ||
3 | E | 4/1/2018 | ||
3 | E | 5/1/2018 | ||
3 | E | 6/1/2018 | ||
3 | E | 7/1/2018 | ||
3 | E | 8/1/2018 | ||
3 | E | 9/1/2018 | ||
3 | E | 10/1/2018 | ||
3 | E | 11/1/2018 | ||
3 | E | 12/1/2018 |
See this:
data have;
infile datalines dlm="09"x dsd truncover;
input
person_id $
provider_id $
admission_dt :mmddyy10.
discharge_dt :mmddyy10.
;
format
admission_dt
discharge_dt yymmdd10.
;
datalines;
1 A 8/1/2014 1/7/2018
1 A 8/1/2014 1/10/2018
1 A 1/11/2018 1/31/2018
1 A 1/11/2018 2/28/2018
1 A 1/11/2018 2/28/2018
1 A 1/11/2018 3/31/2018
1 A 1/11/2018 4/30/2018
1 A 5/16/2018 5/31/2018
1 A 5/16/2018 6/30/2018
1 A 5/16/2018 7/31/2018
1 A 5/16/2018 8/31/2018
1 A 5/16/2018 9/30/2018
1 A 5/16/2018 10/31/2018
1 A 5/16/2018 11/30/2018
1 A 5/16/2018 12/31/2018
2 B 11/1/2017 1/31/2018
2 B 12/24/2017 1/1/2018
2 B 11/1/2017 2/28/2018
2 B 11/1/2017 3/31/2018
2 B 11/1/2017 4/30/2018
2 B 11/1/2017 5/31/2018
2 B 11/1/2017 6/30/2018
2 B 11/1/2017 7/31/2018
2 B 11/1/2017 8/31/2018
2 B 11/1/2017 9/30/2018
2 B 11/1/2017 10/5/2018
2 C 10/6/2018 10/31/2018
2 C 10/5/2018 11/30/2018
2 C 10/5/2018 12/31/2018
3 D 1/27/2018 1/30/2018
3 D 1/27/2018 2/28/2018
3 E 3/15/2018
3 E 3/15/2018
3 E 3/15/2018
3 E 3/15/2018
3 E 3/15/2018
3 E 3/15/2018
3 E 4/1/2018
3 E 5/1/2018
3 E 6/1/2018
3 E 7/1/2018
3 E 8/1/2018
3 E 9/1/2018
3 E 10/1/2018
3 E 11/1/2018
3 E 12/1/2018
;
%let def_discharge = %sysfunc(inputn(2018-12-31,yymmdd10.));
data want;
set have;
by person_id provider_id;
retain _start;
_dis = lag(discharge_dt);
if first.provider_id then _start = admission_dt;
if not first.provider_id and _dis ne . and admission_dt gt sum(_dis,1)
then do;
_tmp = admission_dt;
admission_dt = _start;
_start = _tmp;
_tmp = discharge_dt;
discharge_dt = _dis;
length_of_stay = discharge_dt - admission_dt;
output;
discharge_dt = _tmp;
end;
if last.provider_id
then do;
admission_dt = _start;
discharge_dt = coalesce(discharge_dt,&def_discharge.);
length_of_stay = discharge_dt - admission_dt;
output;
end;
drop _:;
run;
Please note how example data is presented in usable form (data step with datalines), and the use of internationally standardized date formats (YMD dates are always unambiguous, 3-1-2018 could be March 1 or January 3).
Sorry for nobody answering to your question up till now, usually you should get a faster answer.
See this:
data have;
infile datalines dlm="09"x dsd truncover;
input
person_id $
provider_id $
admission_dt :mmddyy10.
discharge_dt :mmddyy10.
;
format
admission_dt
discharge_dt yymmdd10.
;
datalines;
1 A 8/1/2014 1/7/2018
1 A 8/1/2014 1/10/2018
1 A 1/11/2018 1/31/2018
1 A 1/11/2018 2/28/2018
1 A 1/11/2018 2/28/2018
1 A 1/11/2018 3/31/2018
1 A 1/11/2018 4/30/2018
1 A 5/16/2018 5/31/2018
1 A 5/16/2018 6/30/2018
1 A 5/16/2018 7/31/2018
1 A 5/16/2018 8/31/2018
1 A 5/16/2018 9/30/2018
1 A 5/16/2018 10/31/2018
1 A 5/16/2018 11/30/2018
1 A 5/16/2018 12/31/2018
2 B 11/1/2017 1/31/2018
2 B 12/24/2017 1/1/2018
2 B 11/1/2017 2/28/2018
2 B 11/1/2017 3/31/2018
2 B 11/1/2017 4/30/2018
2 B 11/1/2017 5/31/2018
2 B 11/1/2017 6/30/2018
2 B 11/1/2017 7/31/2018
2 B 11/1/2017 8/31/2018
2 B 11/1/2017 9/30/2018
2 B 11/1/2017 10/5/2018
2 C 10/6/2018 10/31/2018
2 C 10/5/2018 11/30/2018
2 C 10/5/2018 12/31/2018
3 D 1/27/2018 1/30/2018
3 D 1/27/2018 2/28/2018
3 E 3/15/2018
3 E 3/15/2018
3 E 3/15/2018
3 E 3/15/2018
3 E 3/15/2018
3 E 3/15/2018
3 E 4/1/2018
3 E 5/1/2018
3 E 6/1/2018
3 E 7/1/2018
3 E 8/1/2018
3 E 9/1/2018
3 E 10/1/2018
3 E 11/1/2018
3 E 12/1/2018
;
%let def_discharge = %sysfunc(inputn(2018-12-31,yymmdd10.));
data want;
set have;
by person_id provider_id;
retain _start;
_dis = lag(discharge_dt);
if first.provider_id then _start = admission_dt;
if not first.provider_id and _dis ne . and admission_dt gt sum(_dis,1)
then do;
_tmp = admission_dt;
admission_dt = _start;
_start = _tmp;
_tmp = discharge_dt;
discharge_dt = _dis;
length_of_stay = discharge_dt - admission_dt;
output;
discharge_dt = _tmp;
end;
if last.provider_id
then do;
admission_dt = _start;
discharge_dt = coalesce(discharge_dt,&def_discharge.);
length_of_stay = discharge_dt - admission_dt;
output;
end;
drop _:;
run;
Please note how example data is presented in usable form (data step with datalines), and the use of internationally standardized date formats (YMD dates are always unambiguous, 3-1-2018 could be March 1 or January 3).
Sorry for nobody answering to your question up till now, usually you should get a faster answer.
Hi Kurt,
Thank you for the solution! The code works well. However, I have trouble in understanding the part as below
if not first.provider_id and _dis ne . and admission_dt gt sum(_dis,1)
then do;
_tmp = admission_dt;
admission_dt = _start;
_start = _tmp;
_tmp = discharge_dt;
discharge_dt = _dis;
length_of_stay = discharge_dt - admission_dt;
output;
discharge_dt = _tmp;
end;
I actually tried to add a bit more code to include claim id info. That is, if there is another column called claim_id, I wanted to keep admission claim_id and discharge claim_id per stay.
The example data would be
PERSON_ID | provider_id | admission_dt | discharge_dt | claim_id | Note |
1 | A | 8/1/2014 | 1/7/2018 | 12345 | a stay starts (admission_dt) from previous years, 1/1/2018 will be used as initial admission_dt |
1 | A | 8/1/2014 | 1/10/2018 | 12346 | Overlapped stay, so it's considered as the same stay as the stay in last claim (last observation) |
1 | A | 1/11/2018 | 1/31/2018 | 12347 | Consecutive date (date difference<=1) from last observation, so the same inpatient stay |
1 | A | 1/11/2018 | 2/28/2018 | 12348 | |
1 | A | 1/11/2018 | 2/28/2018 | 12349 | |
1 | A | 1/11/2018 | 3/31/2018 | 12350 | |
1 | A | 1/11/2018 | 4/30/2018 | 12351 | |
1 | A | 5/16/2018 | 5/31/2018 | 12352 | Non-consecutive date from last admission, so a separate inpatient stay |
1 | A | 5/16/2018 | 6/30/2018 | 12353 | |
1 | A | 5/16/2018 | 7/31/2018 | 12354 | |
1 | A | 5/16/2018 | 8/31/2018 | 12355 | |
1 | A | 5/16/2018 | 9/30/2018 | 12356 | |
1 | A | 5/16/2018 | 10/31/2018 | 12357 | |
1 | A | 5/16/2018 | 11/30/2018 | 12358 | |
1 | A | 5/16/2018 | 12/31/2018 | 12358 | |
2 | B | 11/1/2017 | 1/31/2018 | 21234 | |
2 | B | 12/24/2017 | 1/1/2018 | 21235 | |
2 | B | 11/1/2017 | 2/28/2018 | 21236 | |
2 | B | 11/1/2017 | 3/31/2018 | 21237 | |
2 | B | 11/1/2017 | 4/30/2018 | 21238 | |
2 | B | 11/1/2017 | 5/31/2018 | 21239 | |
2 | B | 11/1/2017 | 6/30/2018 | 21240 | |
2 | B | 11/1/2017 | 7/31/2018 | 21241 | |
2 | B | 11/1/2017 | 8/31/2018 | 21242 | |
2 | B | 11/1/2017 | 9/30/2018 | 21243 | |
2 | B | 11/1/2017 | 10/5/2018 | 21244 | |
2 | C | 10/6/2018 | 10/31/2018 | 21245 | Consecutive date from last admission but with different provider, so a separate inpatient stay |
2 | C | 10/5/2018 | 11/30/2018 | 21246 | |
2 | C | 10/5/2018 | 12/31/2018 | 21247 | |
3 | D | 1/27/2018 | 1/30/2018 | 31234 | |
3 | D | 1/27/2018 | 2/28/2018 | 31235 | |
3 | E | 3/15/2018 | 31236 | A new inpatient stay at a different provider 15 days after the last stay | |
3 | E | 3/15/2018 | 31237 | Missing discharge date means the person is not discharged per claim, so 12/31/2018 will be used as the discharge date for the stay | |
3 | E | 3/15/2018 | 31238 | ||
3 | E | 3/15/2018 | 31239 | ||
3 | E | 3/15/2018 | 31240 | ||
3 | E | 3/15/2018 | 31241 | ||
3 | E | 4/1/2018 | 31242 | ||
3 | E | 5/1/2018 | 31243 | ||
3 | E | 6/1/2018 | 31244 | ||
3 | E | 7/1/2018 | 31245 | ||
3 | E | 8/1/2018 | 31246 | ||
3 | E | 9/1/2018 | 31247 | ||
3 | E | 10/1/2018 | 31248 | ||
3 | E | 11/1/2018 | 31249 | ||
3 | E | 12/1/2018 | 31250 |
The target data would be
PERSON_ID | provider_id | admission_dt | discharge_dt | claim_id_adm | claim_id_discharge | length_of_stay |
1 | A | 8/1/2014 | 4/30/2018 | 12345 | 12351 | 1368 |
1 | A | 5/16/2018 | 12/31/2018 | 12352 | 12358 | 229 |
2 | B | 11/1/2017 | 10/5/2018 | 21234 | 21244 | 338 |
2 | C | 10/5/2018 | 12/31/2018 | 21246 | 21247 | 87 |
3 | D | 1/27/2018 | 2/28/2018 | 31234 | 31235 | 32 |
3 | E | 3/15/2018 | 12/31/2018 | 31236 | 31250 | 291 |
Any further guidance is greatly appreciated!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.