I previously asked how to construct inpatient stays and length of stay here: https://communities.sas.com/t5/SAS-Programming/Count-number-of-inpatient-stays-and-length-of-stay/m-....
On top of that, I'd like to obtain a bit more admission and discharge information per stay. Say, each encounter has a individual claim_id. For each inpatient stay, I wanted to keep the admission claim_id (claim_id when a person was admitted) and the discharge claim_id (claim_id when a person was discharged). The example data look like below (only column claim_id is added, the rest is the same as the old post).
| PERSON_ID | provider_id | admission_dt | discharge_dt | claim_id | Note | 
| 1 | A | 8/1/2014 | 1/7/2018 | 12345 | a stay starts (admission_dt) from previous years, 1/1/2018 will be used as initial admission_dt | 
| 1 | A | 8/1/2014 | 1/10/2018 | 12346 | Overlapped stay, so it's considered as the same stay as the stay in last claim (last observation) | 
| 1 | A | 1/11/2018 | 1/31/2018 | 12347 | Consecutive date (date difference<=1) from last observation, so the same inpatient stay | 
| 1 | A | 1/11/2018 | 2/28/2018 | 12348 | |
| 1 | A | 1/11/2018 | 2/28/2018 | 12349 | |
| 1 | A | 1/11/2018 | 3/31/2018 | 12350 | |
| 1 | A | 1/11/2018 | 4/30/2018 | 12351 | |
| 1 | A | 5/16/2018 | 5/31/2018 | 12352 | Non-consecutive date from last admission, so a separate inpatient stay | 
| 1 | A | 5/16/2018 | 6/30/2018 | 12353 | |
| 1 | A | 5/16/2018 | 7/31/2018 | 12354 | |
| 1 | A | 5/16/2018 | 8/31/2018 | 12355 | |
| 1 | A | 5/16/2018 | 9/30/2018 | 12356 | |
| 1 | A | 5/16/2018 | 10/31/2018 | 12357 | |
| 1 | A | 5/16/2018 | 11/30/2018 | 12358 | |
| 1 | A | 5/16/2018 | 12/31/2018 | 12358 | |
| 2 | B | 11/1/2017 | 1/31/2018 | 21234 | |
| 2 | B | 12/24/2017 | 1/1/2018 | 21235 | |
| 2 | B | 11/1/2017 | 2/28/2018 | 21236 | |
| 2 | B | 11/1/2017 | 3/31/2018 | 21237 | |
| 2 | B | 11/1/2017 | 4/30/2018 | 21238 | |
| 2 | B | 11/1/2017 | 5/31/2018 | 21239 | |
| 2 | B | 11/1/2017 | 6/30/2018 | 21240 | |
| 2 | B | 11/1/2017 | 7/31/2018 | 21241 | |
| 2 | B | 11/1/2017 | 8/31/2018 | 21242 | |
| 2 | B | 11/1/2017 | 9/30/2018 | 21243 | |
| 2 | B | 11/1/2017 | 10/5/2018 | 21244 | |
| 2 | C | 10/6/2018 | 10/31/2018 | 21245 | Consecutive date from last admission but with different provider, so a separate inpatient stay | 
| 2 | C | 10/5/2018 | 11/30/2018 | 21246 | |
| 2 | C | 10/5/2018 | 12/31/2018 | 21247 | |
| 3 | D | 1/27/2018 | 1/30/2018 | 31234 | |
| 3 | D | 1/27/2018 | 2/28/2018 | 31235 | |
| 3 | E | 3/15/2018 | 31236 | A new inpatient stay at a different provider 15 days after the last stay | |
| 3 | E | 3/15/2018 | 31237 | 
 Missing discharge date means the person is not discharged per claim, so 12/31/2018 will be used as the discharge date for the stay  | 
|
| 
 
 3  | 
E | 3/15/2018 | 31238 | ||
| 3 | E | 3/15/2018 | 31239 | ||
| 3 | E | 3/15/2018 | 31240 | ||
| 3 | E | 3/15/2018 | 31241 | ||
| 3 | E | 4/1/2018 | 31242 | ||
| 3 | E | 5/1/2018 | 31243 | ||
| 3 | E | 6/1/2018 | 31244 | ||
| 3 | E | 7/1/2018 | 31245 | ||
| 3 | E | 8/1/2018 | 31246 | ||
| 3 | E | 9/1/2018 | 31247 | ||
| 3 | E | 10/1/2018 | 31248 | ||
| 3 | E | 11/1/2018 | 31249 | ||
| 3 | E | 12/1/2018 | 31250 | 
The target output would be
| PERSON_ID | provider_id | admission_dt | discharge_dt | length_of_stay | claim_id_adm | claim_id_discharge | 
| 1 | A | 8/1/2014 | 4/30/2018 | 1368 | 12345 | 12351 | 
| 1 | A | 5/16/2018 | 12/31/2018 | 229 | 12352 | 12358 | 
| 2 | B | 11/1/2017 | 10/5/2018 | 338 | 21234 | 21244 | 
| 2 | C | 10/5/2018 | 12/31/2018 | 87 | 21246 | 21247 | 
| 3 | D | 1/27/2018 | 2/28/2018 | 32 | 31234 | 31235 | 
| 3 | E | 3/15/2018 | 12/31/2018 | 291 | 31236 | 31250 | 
where the last two columns are newly targeted column compared to the old post.
Much appreciated!
Please do yourself and us a BIG favor and always include your data as a data step with datalines, as I have already shown you. Not having to write that speed up the process of finding a solution, and it prevents any questions about variable attributes (lengths, formats) and real content.
You need only a slight expansion of the original logic, as the "cutover point" for the new variables is the same.
data have;
infile datalines dlm="09"x dsd truncover;
input
  person_id $
  provider_id $
  admission_dt :mmddyy10.
  discharge_dt :mmddyy10.
  claim_id :$5.
;
format
  admission_dt
  discharge_dt yymmdd10.
;
datalines;
1	A	8/1/2014	1/7/2018	12345
1	A	8/1/2014	1/10/2018	12346
1	A	1/11/2018	1/31/2018	12347
1	A	1/11/2018	2/28/2018	12348
1	A	1/11/2018	2/28/2018	12349
1	A	1/11/2018	3/31/2018	12350
1	A	1/11/2018	4/30/2018	12351
1	A	5/16/2018	5/31/2018	12352
1	A	5/16/2018	6/30/2018	12353
1	A	5/16/2018	7/31/2018	12354	 
1	A	5/16/2018	8/31/2018	12355	 
1	A	5/16/2018	9/30/2018	12356	 
1	A	5/16/2018	10/31/2018	12357	 
1	A	5/16/2018	11/30/2018	12358	 
1	A	5/16/2018	12/31/2018	12358	 
2	B	11/1/2017	1/31/2018	21234	 
2	B	12/24/2017	1/1/2018	21235	 
2	B	11/1/2017	2/28/2018	21236	 
2	B	11/1/2017	3/31/2018	21237	 
2	B	11/1/2017	4/30/2018	21238	 
2	B	11/1/2017	5/31/2018	21239	 
2	B	11/1/2017	6/30/2018	21240	 
2	B	11/1/2017	7/31/2018	21241	 
2	B	11/1/2017	8/31/2018	21242	 
2	B	11/1/2017	9/30/2018	21243	 
2	B	11/1/2017	10/5/2018	21244	 
2	C	10/6/2018	10/31/2018	21245	
2	C	10/5/2018	11/30/2018	21246	 
2	C	10/5/2018	12/31/2018	21247	 
3	D	1/27/2018	1/30/2018	31234	 
3	D	1/27/2018	2/28/2018	31235	 
3	E	3/15/2018	 	31236	
3	E	3/15/2018	 	31237	 
3	E	3/15/2018	 	31238
3	E	3/15/2018	 	31239
3	E	3/15/2018	 	31240
3	E	3/15/2018	 	31241
3	E	4/1/2018	 	31242
3	E	5/1/2018	 	31243
3	E	6/1/2018	 	31244
3	E	7/1/2018	 	31245
3	E	8/1/2018	 	31246
3	E	9/1/2018	 	31247
3	E	10/1/2018	 	31248
3	E	11/1/2018	 	31249
3	E	12/1/2018	 	31250
;
%let def_discharge = %sysfunc(inputn(2018-12-31,yymmdd10.));
data want;
set have;
by person_id provider_id;
retain _start length_of_stay claim_id_adm;
/* length_of stay is included here solely for the purpose of variable order, purely cosmetic! */
_dis = lag(discharge_dt);
_claim_id = lag(claim_id);
if first.provider_id
then do;
  _start = admission_dt;
  claim_id_adm = claim_id;
end;
if not first.provider_id and _dis ne . and admission_dt gt sum(_dis,1)
then do;
  _tmp = admission_dt;
  admission_dt = _start;
  _start = _tmp;
  _tmp = discharge_dt;
  discharge_dt = _dis;
  length_of_stay = discharge_dt - admission_dt;
  claim_id_discharge = _claim_id;
  output;
  discharge_dt = _tmp;
  claim_id_adm = claim_id;
end;
if last.provider_id
then do;
  admission_dt = _start;
  discharge_dt = coalesce(discharge_dt,&def_discharge.);
  length_of_stay = discharge_dt - admission_dt;
  claim_id_discharge = claim_id;
  output;
end;
drop _: claim_id;
run;
					
				
			
			
				
			
			
			
				
			
			
			
			
			
		Please do yourself and us a BIG favor and always include your data as a data step with datalines, as I have already shown you. Not having to write that speed up the process of finding a solution, and it prevents any questions about variable attributes (lengths, formats) and real content.
You need only a slight expansion of the original logic, as the "cutover point" for the new variables is the same.
data have;
infile datalines dlm="09"x dsd truncover;
input
  person_id $
  provider_id $
  admission_dt :mmddyy10.
  discharge_dt :mmddyy10.
  claim_id :$5.
;
format
  admission_dt
  discharge_dt yymmdd10.
;
datalines;
1	A	8/1/2014	1/7/2018	12345
1	A	8/1/2014	1/10/2018	12346
1	A	1/11/2018	1/31/2018	12347
1	A	1/11/2018	2/28/2018	12348
1	A	1/11/2018	2/28/2018	12349
1	A	1/11/2018	3/31/2018	12350
1	A	1/11/2018	4/30/2018	12351
1	A	5/16/2018	5/31/2018	12352
1	A	5/16/2018	6/30/2018	12353
1	A	5/16/2018	7/31/2018	12354	 
1	A	5/16/2018	8/31/2018	12355	 
1	A	5/16/2018	9/30/2018	12356	 
1	A	5/16/2018	10/31/2018	12357	 
1	A	5/16/2018	11/30/2018	12358	 
1	A	5/16/2018	12/31/2018	12358	 
2	B	11/1/2017	1/31/2018	21234	 
2	B	12/24/2017	1/1/2018	21235	 
2	B	11/1/2017	2/28/2018	21236	 
2	B	11/1/2017	3/31/2018	21237	 
2	B	11/1/2017	4/30/2018	21238	 
2	B	11/1/2017	5/31/2018	21239	 
2	B	11/1/2017	6/30/2018	21240	 
2	B	11/1/2017	7/31/2018	21241	 
2	B	11/1/2017	8/31/2018	21242	 
2	B	11/1/2017	9/30/2018	21243	 
2	B	11/1/2017	10/5/2018	21244	 
2	C	10/6/2018	10/31/2018	21245	
2	C	10/5/2018	11/30/2018	21246	 
2	C	10/5/2018	12/31/2018	21247	 
3	D	1/27/2018	1/30/2018	31234	 
3	D	1/27/2018	2/28/2018	31235	 
3	E	3/15/2018	 	31236	
3	E	3/15/2018	 	31237	 
3	E	3/15/2018	 	31238
3	E	3/15/2018	 	31239
3	E	3/15/2018	 	31240
3	E	3/15/2018	 	31241
3	E	4/1/2018	 	31242
3	E	5/1/2018	 	31243
3	E	6/1/2018	 	31244
3	E	7/1/2018	 	31245
3	E	8/1/2018	 	31246
3	E	9/1/2018	 	31247
3	E	10/1/2018	 	31248
3	E	11/1/2018	 	31249
3	E	12/1/2018	 	31250
;
%let def_discharge = %sysfunc(inputn(2018-12-31,yymmdd10.));
data want;
set have;
by person_id provider_id;
retain _start length_of_stay claim_id_adm;
/* length_of stay is included here solely for the purpose of variable order, purely cosmetic! */
_dis = lag(discharge_dt);
_claim_id = lag(claim_id);
if first.provider_id
then do;
  _start = admission_dt;
  claim_id_adm = claim_id;
end;
if not first.provider_id and _dis ne . and admission_dt gt sum(_dis,1)
then do;
  _tmp = admission_dt;
  admission_dt = _start;
  _start = _tmp;
  _tmp = discharge_dt;
  discharge_dt = _dis;
  length_of_stay = discharge_dt - admission_dt;
  claim_id_discharge = _claim_id;
  output;
  discharge_dt = _tmp;
  claim_id_adm = claim_id;
end;
if last.provider_id
then do;
  admission_dt = _start;
  discharge_dt = coalesce(discharge_dt,&def_discharge.);
  length_of_stay = discharge_dt - admission_dt;
  claim_id_discharge = claim_id;
  output;
end;
drop _: claim_id;
run;
					
				
			
			
				
			
			
			
			
			
			
			
		Good day
Please see below for an approach. You should be able to leverage the below code. The approach has three steps:
proc sql;
	create table want_base_file as 
		select distinct
			PERSON_ID
			,provider_id
			,admission_dt
			,max(discharge_dt) format=date9. as final_discharge_dt
			,intck('day', admission_dt, calculated final_discharge_dt) as length_of_stay
		from Data_Import_20220802
			group by PERSON_ID
				,provider_id
				,admission_dt;
quit;
proc sql;
	create table want_claim_id_flag as 
		select distinct
			PERSON_ID
			,provider_id
			,admission_dt
			,claim_id
			,(case 
				when claim_id=12345 then 'A' 
				when claim_id=12346 then 'D' 
				else '' end) as claim_flag
		from Data_Import_20220802;
quit;
proc sql;
	create table want_final as 
		select distinct 
			a.*
			,b.claim_id as claim_id_adm
			,c.claim_id as claim_id_discharge
		from want_base_file as a
			left join want_claim_id_flag as b 
				on (a.PERSON_ID=b.PERSON_ID 
					and a.provider_id=b.provider_id 
					and a.admission_dt=b.admission_dt
					and b.claim_flag='A')
			left join want_claim_id_flag as C 
				on (a.PERSON_ID=c.PERSON_ID 
					and a.provider_id=c.provider_id 
					and a.admission_dt=c.admission_dt
					and c.claim_flag='D');
quit;How do you plan to make this
			,(case 
				when claim_id=12345 then 'A' 
				when claim_id=12346 then 'D' 
				else '' end) as claim_flag
dynamic to work for all stays?
Good day Kurt
Hope you're doing well.
From a dynamic perspective, the approach there would be to focus on the context of the work. Ideally, it would be more sustainable to write it slightly different, based on data inputs from users. Below is simply an example.
proc sql;
	create table want_claim_id_flag_dynamic as 
		select distinct
			PERSON_ID
			,provider_id
			,admission_dt
			,claim_id
			,(case 
				when upcase(note) like '%ADMISSION%' then 'A' 
				when upcase(note) like '%DISCHARGE%' then 'D' 
				else '' end) as claim_flag_dynamic
		from Data_Import_20220802;
quit;
					
				
			
			
				
			
			
			
			
			
			
			
		if not first.provider_id and _dis ne . and admission_dt gt sum(_dis,1)
/* this condition becomes true when
- we do not have the first observation of a group (so the previous observation must have been from the same group
- the previous discharge date is not missing (which would indicate a stay without end)
- and the difference between the previous discharge date and the current admission date is greater than 1
the SUM() function is used to prevent "missing value" NOTEs */
then do;
  /* since we recognize the end of a stay after the fact, we need to do some juggling with values */
  _tmp = admission_dt; /* save the current admission date */
  admission_dt = _start; set admission date from the RETAINed value */
  _start = _tmp; /* set the new start from the current admission date */
  _tmp = discharge_dt; /* save the current discharge date */
  discharge_dt = _dis; /* set the discharge date from the previous observation which marks the end of a stay */
  length_of_stay = discharge_dt - admission_dt; /* calculate */
  claim_id_discharge = _claim_id; /* set the claim from the previous observation */
  output; /* write to output */
  discharge_dt = _tmp; /* restore the current value (is necessary if the current observation is the last of a group */
  claim_id_adm = claim_id; /* set the claim for the start of a stay */
end;
					
				
			
			
				
			
			
			
			
			
			
			
		It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.