BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
lizzy28
Quartz | Level 8

I previously asked how to construct inpatient stays and length of stay here: https://communities.sas.com/t5/SAS-Programming/Count-number-of-inpatient-stays-and-length-of-stay/m-....

 

On top of that, I'd like to obtain a bit more admission and discharge information per stay. Say, each encounter has a individual claim_id. For each inpatient stay, I wanted to keep the admission claim_id (claim_id when a person was admitted) and the discharge claim_id (claim_id when a person was discharged). The example data look like below (only column claim_id is added, the rest is the same as the old post). 

PERSON_ID provider_id admission_dt discharge_dt claim_id Note
1 A 8/1/2014 1/7/2018 12345 a stay starts (admission_dt) from previous years, 1/1/2018 will be used as initial admission_dt
1 A 8/1/2014 1/10/2018 12346 Overlapped stay, so it's considered as the same stay as the stay in last claim (last observation)
1 A 1/11/2018 1/31/2018 12347 Consecutive date (date difference<=1) from last observation, so the same inpatient stay
1 A 1/11/2018 2/28/2018 12348  
1 A 1/11/2018 2/28/2018 12349  
1 A 1/11/2018 3/31/2018 12350  
1 A 1/11/2018 4/30/2018 12351  
1 A 5/16/2018 5/31/2018 12352 Non-consecutive date from last admission, so a separate inpatient stay
1 A 5/16/2018 6/30/2018 12353  
1 A 5/16/2018 7/31/2018 12354  
1 A 5/16/2018 8/31/2018 12355  
1 A 5/16/2018 9/30/2018 12356  
1 A 5/16/2018 10/31/2018 12357  
1 A 5/16/2018 11/30/2018 12358  
1 A 5/16/2018 12/31/2018 12358  
2 B 11/1/2017 1/31/2018 21234  
2 B 12/24/2017 1/1/2018 21235  
2 B 11/1/2017 2/28/2018 21236  
2 B 11/1/2017 3/31/2018 21237  
2 B 11/1/2017 4/30/2018 21238  
2 B 11/1/2017 5/31/2018 21239  
2 B 11/1/2017 6/30/2018 21240  
2 B 11/1/2017 7/31/2018 21241  
2 B 11/1/2017 8/31/2018 21242  
2 B 11/1/2017 9/30/2018 21243  
2 B 11/1/2017 10/5/2018 21244  
2 C 10/6/2018 10/31/2018 21245 Consecutive date from last admission but with different provider, so a separate inpatient stay
2 C 10/5/2018 11/30/2018 21246  
2 C 10/5/2018 12/31/2018 21247  
3 D 1/27/2018 1/30/2018 31234  
3 D 1/27/2018 2/28/2018 31235  
3 E 3/15/2018   31236 A new inpatient stay at a different provider 15 days after the last stay
3 E 3/15/2018   31237

Missing discharge date means the person is not discharged per claim, so 12/31/2018 will be used as the discharge date for the stay

 

3

E 3/15/2018   31238
3 E 3/15/2018   31239
3 E 3/15/2018   31240
3 E 3/15/2018   31241
3 E 4/1/2018   31242
3 E 5/1/2018   31243
3 E 6/1/2018   31244
3 E 7/1/2018   31245
3 E 8/1/2018   31246
3 E 9/1/2018   31247
3 E 10/1/2018   31248
3 E 11/1/2018   31249
3 E 12/1/2018   31250

 

The target output would be 

PERSON_ID provider_id admission_dt discharge_dt length_of_stay claim_id_adm claim_id_discharge
1 A 8/1/2014 4/30/2018 1368 12345 12351
1 A 5/16/2018 12/31/2018 229 12352 12358
2 B 11/1/2017 10/5/2018 338 21234 21244
2 C 10/5/2018 12/31/2018 87 21246 21247
3 D 1/27/2018 2/28/2018 32 31234 31235
3 E 3/15/2018 12/31/2018 291 31236 31250

where the last two columns are newly targeted column compared to the old post. 

 

Much appreciated!

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Please do yourself and us a BIG favor and always include your data as a data step with datalines, as I have already shown you. Not having to write that speed up the process of finding a solution, and it prevents any questions about variable attributes (lengths, formats) and real content.

You need only a slight expansion of the original logic, as the "cutover point" for the new variables is the same.

data have;
infile datalines dlm="09"x dsd truncover;
input
  person_id $
  provider_id $
  admission_dt :mmddyy10.
  discharge_dt :mmddyy10.
  claim_id :$5.
;
format
  admission_dt
  discharge_dt yymmdd10.
;
datalines;
1	A	8/1/2014	1/7/2018	12345
1	A	8/1/2014	1/10/2018	12346
1	A	1/11/2018	1/31/2018	12347
1	A	1/11/2018	2/28/2018	12348
1	A	1/11/2018	2/28/2018	12349
1	A	1/11/2018	3/31/2018	12350
1	A	1/11/2018	4/30/2018	12351
1	A	5/16/2018	5/31/2018	12352
1	A	5/16/2018	6/30/2018	12353
1	A	5/16/2018	7/31/2018	12354	 
1	A	5/16/2018	8/31/2018	12355	 
1	A	5/16/2018	9/30/2018	12356	 
1	A	5/16/2018	10/31/2018	12357	 
1	A	5/16/2018	11/30/2018	12358	 
1	A	5/16/2018	12/31/2018	12358	 
2	B	11/1/2017	1/31/2018	21234	 
2	B	12/24/2017	1/1/2018	21235	 
2	B	11/1/2017	2/28/2018	21236	 
2	B	11/1/2017	3/31/2018	21237	 
2	B	11/1/2017	4/30/2018	21238	 
2	B	11/1/2017	5/31/2018	21239	 
2	B	11/1/2017	6/30/2018	21240	 
2	B	11/1/2017	7/31/2018	21241	 
2	B	11/1/2017	8/31/2018	21242	 
2	B	11/1/2017	9/30/2018	21243	 
2	B	11/1/2017	10/5/2018	21244	 
2	C	10/6/2018	10/31/2018	21245	
2	C	10/5/2018	11/30/2018	21246	 
2	C	10/5/2018	12/31/2018	21247	 
3	D	1/27/2018	1/30/2018	31234	 
3	D	1/27/2018	2/28/2018	31235	 
3	E	3/15/2018	 	31236	
3	E	3/15/2018	 	31237	 
3	E	3/15/2018	 	31238
3	E	3/15/2018	 	31239
3	E	3/15/2018	 	31240
3	E	3/15/2018	 	31241
3	E	4/1/2018	 	31242
3	E	5/1/2018	 	31243
3	E	6/1/2018	 	31244
3	E	7/1/2018	 	31245
3	E	8/1/2018	 	31246
3	E	9/1/2018	 	31247
3	E	10/1/2018	 	31248
3	E	11/1/2018	 	31249
3	E	12/1/2018	 	31250
;

%let def_discharge = %sysfunc(inputn(2018-12-31,yymmdd10.));

data want;
set have;
by person_id provider_id;
retain _start length_of_stay claim_id_adm;
/* length_of stay is included here solely for the purpose of variable order, purely cosmetic! */
_dis = lag(discharge_dt);
_claim_id = lag(claim_id);
if first.provider_id
then do;
  _start = admission_dt;
  claim_id_adm = claim_id;
end;
if not first.provider_id and _dis ne . and admission_dt gt sum(_dis,1)
then do;
  _tmp = admission_dt;
  admission_dt = _start;
  _start = _tmp;
  _tmp = discharge_dt;
  discharge_dt = _dis;
  length_of_stay = discharge_dt - admission_dt;
  claim_id_discharge = _claim_id;
  output;
  discharge_dt = _tmp;
  claim_id_adm = claim_id;
end;
if last.provider_id
then do;
  admission_dt = _start;
  discharge_dt = coalesce(discharge_dt,&def_discharge.);
  length_of_stay = discharge_dt - admission_dt;
  claim_id_discharge = claim_id;
  output;
end;
drop _: claim_id;
run;

View solution in original post

6 REPLIES 6
Kurt_Bremser
Super User

Please do yourself and us a BIG favor and always include your data as a data step with datalines, as I have already shown you. Not having to write that speed up the process of finding a solution, and it prevents any questions about variable attributes (lengths, formats) and real content.

You need only a slight expansion of the original logic, as the "cutover point" for the new variables is the same.

data have;
infile datalines dlm="09"x dsd truncover;
input
  person_id $
  provider_id $
  admission_dt :mmddyy10.
  discharge_dt :mmddyy10.
  claim_id :$5.
;
format
  admission_dt
  discharge_dt yymmdd10.
;
datalines;
1	A	8/1/2014	1/7/2018	12345
1	A	8/1/2014	1/10/2018	12346
1	A	1/11/2018	1/31/2018	12347
1	A	1/11/2018	2/28/2018	12348
1	A	1/11/2018	2/28/2018	12349
1	A	1/11/2018	3/31/2018	12350
1	A	1/11/2018	4/30/2018	12351
1	A	5/16/2018	5/31/2018	12352
1	A	5/16/2018	6/30/2018	12353
1	A	5/16/2018	7/31/2018	12354	 
1	A	5/16/2018	8/31/2018	12355	 
1	A	5/16/2018	9/30/2018	12356	 
1	A	5/16/2018	10/31/2018	12357	 
1	A	5/16/2018	11/30/2018	12358	 
1	A	5/16/2018	12/31/2018	12358	 
2	B	11/1/2017	1/31/2018	21234	 
2	B	12/24/2017	1/1/2018	21235	 
2	B	11/1/2017	2/28/2018	21236	 
2	B	11/1/2017	3/31/2018	21237	 
2	B	11/1/2017	4/30/2018	21238	 
2	B	11/1/2017	5/31/2018	21239	 
2	B	11/1/2017	6/30/2018	21240	 
2	B	11/1/2017	7/31/2018	21241	 
2	B	11/1/2017	8/31/2018	21242	 
2	B	11/1/2017	9/30/2018	21243	 
2	B	11/1/2017	10/5/2018	21244	 
2	C	10/6/2018	10/31/2018	21245	
2	C	10/5/2018	11/30/2018	21246	 
2	C	10/5/2018	12/31/2018	21247	 
3	D	1/27/2018	1/30/2018	31234	 
3	D	1/27/2018	2/28/2018	31235	 
3	E	3/15/2018	 	31236	
3	E	3/15/2018	 	31237	 
3	E	3/15/2018	 	31238
3	E	3/15/2018	 	31239
3	E	3/15/2018	 	31240
3	E	3/15/2018	 	31241
3	E	4/1/2018	 	31242
3	E	5/1/2018	 	31243
3	E	6/1/2018	 	31244
3	E	7/1/2018	 	31245
3	E	8/1/2018	 	31246
3	E	9/1/2018	 	31247
3	E	10/1/2018	 	31248
3	E	11/1/2018	 	31249
3	E	12/1/2018	 	31250
;

%let def_discharge = %sysfunc(inputn(2018-12-31,yymmdd10.));

data want;
set have;
by person_id provider_id;
retain _start length_of_stay claim_id_adm;
/* length_of stay is included here solely for the purpose of variable order, purely cosmetic! */
_dis = lag(discharge_dt);
_claim_id = lag(claim_id);
if first.provider_id
then do;
  _start = admission_dt;
  claim_id_adm = claim_id;
end;
if not first.provider_id and _dis ne . and admission_dt gt sum(_dis,1)
then do;
  _tmp = admission_dt;
  admission_dt = _start;
  _start = _tmp;
  _tmp = discharge_dt;
  discharge_dt = _dis;
  length_of_stay = discharge_dt - admission_dt;
  claim_id_discharge = _claim_id;
  output;
  discharge_dt = _tmp;
  claim_id_adm = claim_id;
end;
if last.provider_id
then do;
  admission_dt = _start;
  discharge_dt = coalesce(discharge_dt,&def_discharge.);
  length_of_stay = discharge_dt - admission_dt;
  claim_id_discharge = claim_id;
  output;
end;
drop _: claim_id;
run;
Adriaan_Gouws
Obsidian | Level 7

Good day

 

Please see below for an approach. You should be able to leverage the below code. The approach has three steps:

  1. Create a new table with your keys, and derive the length of stay
  2. Create a new table where you isolate the claim IDs, this requires context to your work
  3. Join the claim IDs onto the base file, be careful here to not duplicate, you will notice I've added a few joining keys to ensure the data comes back correctly 
proc sql;
	create table want_base_file as 
		select distinct
			PERSON_ID
			,provider_id
			,admission_dt
			,max(discharge_dt) format=date9. as final_discharge_dt
			,intck('day', admission_dt, calculated final_discharge_dt) as length_of_stay
		from Data_Import_20220802
			group by PERSON_ID
				,provider_id
				,admission_dt;
quit;

proc sql;
	create table want_claim_id_flag as 
		select distinct
			PERSON_ID
			,provider_id
			,admission_dt
			,claim_id
			,(case 
				when claim_id=12345 then 'A' 
				when claim_id=12346 then 'D' 
				else '' end) as claim_flag
		from Data_Import_20220802;
quit;

proc sql;
	create table want_final as 
		select distinct 
			a.*
			,b.claim_id as claim_id_adm
			,c.claim_id as claim_id_discharge
		from want_base_file as a
			left join want_claim_id_flag as b 
				on (a.PERSON_ID=b.PERSON_ID 
					and a.provider_id=b.provider_id 
					and a.admission_dt=b.admission_dt
					and b.claim_flag='A')
			left join want_claim_id_flag as C 
				on (a.PERSON_ID=c.PERSON_ID 
					and a.provider_id=c.provider_id 
					and a.admission_dt=c.admission_dt
					and c.claim_flag='D');
quit;
  1.  
Adriaan_Gouws
Obsidian | Level 7

Good day Kurt

 

Hope you're doing well.

 

From a dynamic perspective, the approach there would be to focus on the context of the work. Ideally, it would be more sustainable to write it slightly different, based on data inputs from users. Below is simply an example.

proc sql;
	create table want_claim_id_flag_dynamic as 
		select distinct
			PERSON_ID
			,provider_id
			,admission_dt
			,claim_id
			,(case 
				when upcase(note) like '%ADMISSION%' then 'A' 
				when upcase(note) like '%DISCHARGE%' then 'D' 
				else '' end) as claim_flag_dynamic
		from Data_Import_20220802;
quit;
lizzy28
Quartz | Level 8
Thanks again, Kurt! This works very well!

Could you explain a bit about the code below? I have difficulty in understanding the part conceptually.
if not first.provider_id and _dis ne . and admission_dt gt sum(_dis,1)
then do;
_tmp = admission_dt;
admission_dt = _start;
_start = _tmp;
_tmp = discharge_dt;
discharge_dt = _dis;
length_of_stay = discharge_dt - admission_dt;
claim_id_discharge = _claim_id;
output;
discharge_dt = _tmp;
claim_id_adm = claim_id;
Kurt_Bremser
Super User
if not first.provider_id and _dis ne . and admission_dt gt sum(_dis,1)
/* this condition becomes true when
- we do not have the first observation of a group (so the previous observation must have been from the same group
- the previous discharge date is not missing (which would indicate a stay without end)
- and the difference between the previous discharge date and the current admission date is greater than 1
the SUM() function is used to prevent "missing value" NOTEs */
then do;
  /* since we recognize the end of a stay after the fact, we need to do some juggling with values */
  _tmp = admission_dt; /* save the current admission date */
  admission_dt = _start; set admission date from the RETAINed value */
  _start = _tmp; /* set the new start from the current admission date */
  _tmp = discharge_dt; /* save the current discharge date */
  discharge_dt = _dis; /* set the discharge date from the previous observation which marks the end of a stay */
  length_of_stay = discharge_dt - admission_dt; /* calculate */
  claim_id_discharge = _claim_id; /* set the claim from the previous observation */
  output; /* write to output */
  discharge_dt = _tmp; /* restore the current value (is necessary if the current observation is the last of a group */
  claim_id_adm = claim_id; /* set the claim for the start of a stay */
end;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 579 views
  • 2 likes
  • 3 in conversation