Solved: Re: Include more inpatient stay info besides length of stay

lizzy28 · Posted 08-01-2022 07:49 PM

I previously asked how to construct inpatient stays and length of stay here: https://communities.sas.com/t5/SAS-Programming/Count-number-of-inpatient-stays-and-length-of-stay/m-....

On top of that, I'd like to obtain a bit more admission and discharge information per stay. Say, each encounter has a individual claim_id. For each inpatient stay, I wanted to keep the admission claim_id (claim_id when a person was admitted) and the discharge claim_id (claim_id when a person was discharged). The example data look like below (only column claim_id is added, the rest is the same as the old post).

PERSON_ID	provider_id	admission_dt	discharge_dt	claim_id	Note
1	A	8/1/2014	1/7/2018	12345	a stay starts (admission_dt) from previous years, 1/1/2018 will be used as initial admission_dt
1	A	8/1/2014	1/10/2018	12346	Overlapped stay, so it's considered as the same stay as the stay in last claim (last observation)
1	A	1/11/2018	1/31/2018	12347	Consecutive date (date difference<=1) from last observation, so the same inpatient stay
1	A	1/11/2018	2/28/2018	12348
1	A	1/11/2018	2/28/2018	12349
1	A	1/11/2018	3/31/2018	12350
1	A	1/11/2018	4/30/2018	12351
1	A	5/16/2018	5/31/2018	12352	Non-consecutive date from last admission, so a separate inpatient stay
1	A	5/16/2018	6/30/2018	12353
1	A	5/16/2018	7/31/2018	12354
1	A	5/16/2018	8/31/2018	12355
1	A	5/16/2018	9/30/2018	12356
1	A	5/16/2018	10/31/2018	12357
1	A	5/16/2018	11/30/2018	12358
1	A	5/16/2018	12/31/2018	12358
2	B	11/1/2017	1/31/2018	21234
2	B	12/24/2017	1/1/2018	21235
2	B	11/1/2017	2/28/2018	21236
2	B	11/1/2017	3/31/2018	21237
2	B	11/1/2017	4/30/2018	21238
2	B	11/1/2017	5/31/2018	21239
2	B	11/1/2017	6/30/2018	21240
2	B	11/1/2017	7/31/2018	21241
2	B	11/1/2017	8/31/2018	21242
2	B	11/1/2017	9/30/2018	21243
2	B	11/1/2017	10/5/2018	21244
2	C	10/6/2018	10/31/2018	21245	Consecutive date from last admission but with different provider, so a separate inpatient stay
2	C	10/5/2018	11/30/2018	21246
2	C	10/5/2018	12/31/2018	21247
3	D	1/27/2018	1/30/2018	31234
3	D	1/27/2018	2/28/2018	31235
3	E	3/15/2018		31236	A new inpatient stay at a different provider 15 days after the last stay
3	E	3/15/2018		31237	Missing discharge date means the person is not discharged per claim, so 12/31/2018 will be used as the discharge date for the stay
3	E	3/15/2018		31238
3	E	3/15/2018		31239
3	E	3/15/2018		31240
3	E	3/15/2018		31241
3	E	4/1/2018		31242
3	E	5/1/2018		31243
3	E	6/1/2018		31244
3	E	7/1/2018		31245
3	E	8/1/2018		31246
3	E	9/1/2018		31247
3	E	10/1/2018		31248
3	E	11/1/2018		31249
3	E	12/1/2018		31250

The target output would be

PERSON_ID	provider_id	admission_dt	discharge_dt	length_of_stay	claim_id_adm	claim_id_discharge
1	A	8/1/2014	4/30/2018	1368	12345	12351
1	A	5/16/2018	12/31/2018	229	12352	12358
2	B	11/1/2017	10/5/2018	338	21234	21244
2	C	10/5/2018	12/31/2018	87	21246	21247
3	D	1/27/2018	2/28/2018	32	31234	31235
3	E	3/15/2018	12/31/2018	291	31236	31250

where the last two columns are newly targeted column compared to the old post.

Much appreciated!

Kurt_Bremser · Posted 08-02-2022 03:48 AM

Please do yourself and us a BIG favor and always include your data as a data step with datalines, as I have already shown you. Not having to write that speed up the process of finding a solution, and it prevents any questions about variable attributes (lengths, formats) and real content.

You need only a slight expansion of the original logic, as the "cutover point" for the new variables is the same.

data have;
infile datalines dlm="09"x dsd truncover;
input
  person_id $
  provider_id $
  admission_dt :mmddyy10.
  discharge_dt :mmddyy10.
  claim_id :$5.
;
format
  admission_dt
  discharge_dt yymmdd10.
;
datalines;
1	A	8/1/2014	1/7/2018	12345
1	A	8/1/2014	1/10/2018	12346
1	A	1/11/2018	1/31/2018	12347
1	A	1/11/2018	2/28/2018	12348
1	A	1/11/2018	2/28/2018	12349
1	A	1/11/2018	3/31/2018	12350
1	A	1/11/2018	4/30/2018	12351
1	A	5/16/2018	5/31/2018	12352
1	A	5/16/2018	6/30/2018	12353
1	A	5/16/2018	7/31/2018	12354	 
1	A	5/16/2018	8/31/2018	12355	 
1	A	5/16/2018	9/30/2018	12356	 
1	A	5/16/2018	10/31/2018	12357	 
1	A	5/16/2018	11/30/2018	12358	 
1	A	5/16/2018	12/31/2018	12358	 
2	B	11/1/2017	1/31/2018	21234	 
2	B	12/24/2017	1/1/2018	21235	 
2	B	11/1/2017	2/28/2018	21236	 
2	B	11/1/2017	3/31/2018	21237	 
2	B	11/1/2017	4/30/2018	21238	 
2	B	11/1/2017	5/31/2018	21239	 
2	B	11/1/2017	6/30/2018	21240	 
2	B	11/1/2017	7/31/2018	21241	 
2	B	11/1/2017	8/31/2018	21242	 
2	B	11/1/2017	9/30/2018	21243	 
2	B	11/1/2017	10/5/2018	21244	 
2	C	10/6/2018	10/31/2018	21245	
2	C	10/5/2018	11/30/2018	21246	 
2	C	10/5/2018	12/31/2018	21247	 
3	D	1/27/2018	1/30/2018	31234	 
3	D	1/27/2018	2/28/2018	31235	 
3	E	3/15/2018	 	31236	
3	E	3/15/2018	 	31237	 
3	E	3/15/2018	 	31238
3	E	3/15/2018	 	31239
3	E	3/15/2018	 	31240
3	E	3/15/2018	 	31241
3	E	4/1/2018	 	31242
3	E	5/1/2018	 	31243
3	E	6/1/2018	 	31244
3	E	7/1/2018	 	31245
3	E	8/1/2018	 	31246
3	E	9/1/2018	 	31247
3	E	10/1/2018	 	31248
3	E	11/1/2018	 	31249
3	E	12/1/2018	 	31250
;

%let def_discharge = %sysfunc(inputn(2018-12-31,yymmdd10.));

data want;
set have;
by person_id provider_id;
retain _start length_of_stay claim_id_adm;
/* length_of stay is included here solely for the purpose of variable order, purely cosmetic! */
_dis = lag(discharge_dt);
_claim_id = lag(claim_id);
if first.provider_id
then do;
  _start = admission_dt;
  claim_id_adm = claim_id;
end;
if not first.provider_id and _dis ne . and admission_dt gt sum(_dis,1)
then do;
  _tmp = admission_dt;
  admission_dt = _start;
  _start = _tmp;
  _tmp = discharge_dt;
  discharge_dt = _dis;
  length_of_stay = discharge_dt - admission_dt;
  claim_id_discharge = _claim_id;
  output;
  discharge_dt = _tmp;
  claim_id_adm = claim_id;
end;
if last.provider_id
then do;
  admission_dt = _start;
  discharge_dt = coalesce(discharge_dt,&def_discharge.);
  length_of_stay = discharge_dt - admission_dt;
  claim_id_discharge = claim_id;
  output;
end;
drop _: claim_id;
run;

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

View solution in original post

Kurt_Bremser · Posted 08-02-2022 03:48 AM

Please do yourself and us a BIG favor and always include your data as a data step with datalines, as I have already shown you. Not having to write that speed up the process of finding a solution, and it prevents any questions about variable attributes (lengths, formats) and real content.

You need only a slight expansion of the original logic, as the "cutover point" for the new variables is the same.

data have;
infile datalines dlm="09"x dsd truncover;
input
  person_id $
  provider_id $
  admission_dt :mmddyy10.
  discharge_dt :mmddyy10.
  claim_id :$5.
;
format
  admission_dt
  discharge_dt yymmdd10.
;
datalines;
1	A	8/1/2014	1/7/2018	12345
1	A	8/1/2014	1/10/2018	12346
1	A	1/11/2018	1/31/2018	12347
1	A	1/11/2018	2/28/2018	12348
1	A	1/11/2018	2/28/2018	12349
1	A	1/11/2018	3/31/2018	12350
1	A	1/11/2018	4/30/2018	12351
1	A	5/16/2018	5/31/2018	12352
1	A	5/16/2018	6/30/2018	12353
1	A	5/16/2018	7/31/2018	12354	 
1	A	5/16/2018	8/31/2018	12355	 
1	A	5/16/2018	9/30/2018	12356	 
1	A	5/16/2018	10/31/2018	12357	 
1	A	5/16/2018	11/30/2018	12358	 
1	A	5/16/2018	12/31/2018	12358	 
2	B	11/1/2017	1/31/2018	21234	 
2	B	12/24/2017	1/1/2018	21235	 
2	B	11/1/2017	2/28/2018	21236	 
2	B	11/1/2017	3/31/2018	21237	 
2	B	11/1/2017	4/30/2018	21238	 
2	B	11/1/2017	5/31/2018	21239	 
2	B	11/1/2017	6/30/2018	21240	 
2	B	11/1/2017	7/31/2018	21241	 
2	B	11/1/2017	8/31/2018	21242	 
2	B	11/1/2017	9/30/2018	21243	 
2	B	11/1/2017	10/5/2018	21244	 
2	C	10/6/2018	10/31/2018	21245	
2	C	10/5/2018	11/30/2018	21246	 
2	C	10/5/2018	12/31/2018	21247	 
3	D	1/27/2018	1/30/2018	31234	 
3	D	1/27/2018	2/28/2018	31235	 
3	E	3/15/2018	 	31236	
3	E	3/15/2018	 	31237	 
3	E	3/15/2018	 	31238
3	E	3/15/2018	 	31239
3	E	3/15/2018	 	31240
3	E	3/15/2018	 	31241
3	E	4/1/2018	 	31242
3	E	5/1/2018	 	31243
3	E	6/1/2018	 	31244
3	E	7/1/2018	 	31245
3	E	8/1/2018	 	31246
3	E	9/1/2018	 	31247
3	E	10/1/2018	 	31248
3	E	11/1/2018	 	31249
3	E	12/1/2018	 	31250
;

%let def_discharge = %sysfunc(inputn(2018-12-31,yymmdd10.));

data want;
set have;
by person_id provider_id;
retain _start length_of_stay claim_id_adm;
/* length_of stay is included here solely for the purpose of variable order, purely cosmetic! */
_dis = lag(discharge_dt);
_claim_id = lag(claim_id);
if first.provider_id
then do;
  _start = admission_dt;
  claim_id_adm = claim_id;
end;
if not first.provider_id and _dis ne . and admission_dt gt sum(_dis,1)
then do;
  _tmp = admission_dt;
  admission_dt = _start;
  _start = _tmp;
  _tmp = discharge_dt;
  discharge_dt = _dis;
  length_of_stay = discharge_dt - admission_dt;
  claim_id_discharge = _claim_id;
  output;
  discharge_dt = _tmp;
  claim_id_adm = claim_id;
end;
if last.provider_id
then do;
  admission_dt = _start;
  discharge_dt = coalesce(discharge_dt,&def_discharge.);
  length_of_stay = discharge_dt - admission_dt;
  claim_id_discharge = claim_id;
  output;
end;
drop _: claim_id;
run;

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Adriaan_Gouws · Posted 08-02-2022 03:58 AM

Good day

Please see below for an approach. You should be able to leverage the below code. The approach has three steps:

Create a new table with your keys, and derive the length of stay
Create a new table where you isolate the claim IDs, this requires context to your work
Join the claim IDs onto the base file, be careful here to not duplicate, you will notice I've added a few joining keys to ensure the data comes back correctly

proc sql;
	create table want_base_file as 
		select distinct
			PERSON_ID
			,provider_id
			,admission_dt
			,max(discharge_dt) format=date9. as final_discharge_dt
			,intck('day', admission_dt, calculated final_discharge_dt) as length_of_stay
		from Data_Import_20220802
			group by PERSON_ID
				,provider_id
				,admission_dt;
quit;

proc sql;
	create table want_claim_id_flag as 
		select distinct
			PERSON_ID
			,provider_id
			,admission_dt
			,claim_id
			,(case 
				when claim_id=12345 then 'A' 
				when claim_id=12346 then 'D' 
				else '' end) as claim_flag
		from Data_Import_20220802;
quit;

proc sql;
	create table want_final as 
		select distinct 
			a.*
			,b.claim_id as claim_id_adm
			,c.claim_id as claim_id_discharge
		from want_base_file as a
			left join want_claim_id_flag as b 
				on (a.PERSON_ID=b.PERSON_ID 
					and a.provider_id=b.provider_id 
					and a.admission_dt=b.admission_dt
					and b.claim_flag='A')
			left join want_claim_id_flag as C 
				on (a.PERSON_ID=c.PERSON_ID 
					and a.provider_id=c.provider_id 
					and a.admission_dt=c.admission_dt
					and c.claim_flag='D');
quit;

Kurt_Bremser · Posted 08-02-2022 04:30 AM

How do you plan to make this

			,(case 
				when claim_id=12345 then 'A' 
				when claim_id=12346 then 'D' 
				else '' end) as claim_flag

dynamic to work for all stays?

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Adriaan_Gouws · Posted 08-02-2022 05:13 AM

Good day Kurt

Hope you're doing well.

From a dynamic perspective, the approach there would be to focus on the context of the work. Ideally, it would be more sustainable to write it slightly different, based on data inputs from users. Below is simply an example.

proc sql;
	create table want_claim_id_flag_dynamic as 
		select distinct
			PERSON_ID
			,provider_id
			,admission_dt
			,claim_id
			,(case 
				when upcase(note) like '%ADMISSION%' then 'A' 
				when upcase(note) like '%DISCHARGE%' then 'D' 
				else '' end) as claim_flag_dynamic
		from Data_Import_20220802;
quit;

lizzy28 · Posted 08-02-2022 06:07 PM

Thanks again, Kurt! This works very well!

Could you explain a bit about the code below? I have difficulty in understanding the part conceptually.
if not first.provider_id and _dis ne . and admission_dt gt sum(_dis,1)
then do;
_tmp = admission_dt;
admission_dt = _start;
_start = _tmp;
_tmp = discharge_dt;
discharge_dt = _dis;
length_of_stay = discharge_dt - admission_dt;
claim_id_discharge = _claim_id;
output;
discharge_dt = _tmp;
claim_id_adm = claim_id;

Kurt_Bremser · Posted 08-03-2022 03:24 AM

if not first.provider_id and _dis ne . and admission_dt gt sum(_dis,1)
/* this condition becomes true when
- we do not have the first observation of a group (so the previous observation must have been from the same group
- the previous discharge date is not missing (which would indicate a stay without end)
- and the difference between the previous discharge date and the current admission date is greater than 1
the SUM() function is used to prevent "missing value" NOTEs */
then do;
  /* since we recognize the end of a stay after the fact, we need to do some juggling with values */
  _tmp = admission_dt; /* save the current admission date */
  admission_dt = _start; set admission date from the RETAINed value */
  _start = _tmp; /* set the new start from the current admission date */
  _tmp = discharge_dt; /* save the current discharge date */
  discharge_dt = _dis; /* set the discharge date from the previous observation which marks the end of a stay */
  length_of_stay = discharge_dt - admission_dt; /* calculate */
  claim_id_discharge = _claim_id; /* set the claim from the previous observation */
  output; /* write to output */
  discharge_dt = _tmp; /* restore the current value (is necessary if the current observation is the last of a group */
  claim_id_adm = claim_id; /* set the claim for the start of a stay */
end;

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

SAS Innovate 2025: Call for Content

Classroom Training Available!