I would like to find out the hospital transitions of each patient during year 2018. The claims data look like below
(1) The data I have looks like:
PatID FacID clm_from_dt clm_thru_dt
001 40001 1/3/2018 1/6/2018
001 40001 1/14/2018 1/19/2018
001 40001 3/7/2018 3/10/2018
The data I want looks like (need to insert rows to make up the date gap between two claims in order to capture the transition):
PatID FacID clm_from_dt clm_thru_dt
001 40001 1/3/2018 1/6/2018
001 1/6/2018 1/14/2018
001 40001 1/14/2018 1/19/2018
001 1/19/2018 3/7/2018
001 40001 3/7/2018 3/10/2018
001 3/10/2018 12/31/2018
(2) For some patients, they visited hospital multiple times during their stay in a long term care facility. So there is overlapping between the previous claim through date and next claim from date. The data looks like below:
PatID FacID clm_from_dt clm_thru_dt
002 43517 2/1/2018 3/31/2018
002 40002 2/10/2018 2/11/2018
002 42168 2/12/2018 2/22/2018
002 43517 4/1/2018 5/31/2018
002 40002 4/5/2018 4/7/2018
002 43517 6/1/2018 7/31/2018
For this situation, I would like the data looks like below:
PatID FacID clm_from_dt clm_thru_dt
002 43517 2/1/2018 2/10/2018
002 40002 2/10/2018 2/11/2018
002 42168 2/12/2018 2/22/2018
002 43517 2/22/2018 3/31/2018
002 43517 4/1/2018 4/5/2018
002 40002 4/5/2018 4/7/2018
002 43517 4/7/2018 5/31/2018
002 43517 6/1/2018 7/31/2018
002 7/31/2018 12/31/2018
In general, I would like to insert rows to make sure there is no date gap between claims.
Hi @Bobo3
I don't understand why do you still expect overlapping dates in your output table:
e.g. for patient 001:
001 40001 1/3/2018 1/6/2018
001 1/6/2018 1/14/2018 -> why don't you expect 1/7/2018 and 1/13/2018 ?
001 40001 1/14/2018 1/19/2018
001 1/19/2018 3/7/2018 -> why don't you expect 1/20/2018 and 3/6/2018 ?
001 40001 3/7/2018 3/10/2018
001 3/10/2018 12/31/2018 -> why don't you expect 3/11/2018 and 12/31/2018 ?
If this is a mistake, you could try the following program. When FacID is missing, it corresponds to date ranges without hospitalisations.
Hope this help.
data have;
input PatID $ FacID $ clm_from_dt:mmddyy8. clm_thru_dt:mmddyy8.;
format clm_from_dt clm_thru_dt date9.;
cards;
001 40001 1/3/2018 1/6/2018
001 40001 1/14/2018 1/19/2018
001 40001 3/7/2018 3/10/2018
002 43517 2/1/2018 3/31/2018
002 40002 2/10/2018 2/11/2018
002 42168 2/12/2018 2/22/2018
002 43517 4/1/2018 5/31/2018
002 40002 4/5/2018 4/7/2018
002 43517 6/1/2018 7/31/2018
;
run;
/* STEP1 : creation of 1 dataset with 1 row per day of hospitalization */
data day_hosp;
set have;
/* Each hospital transition is identified by the variable source */
source + 1;
/* 1 row per day at hospital */
format day date9.;
day=clm_from_dt - 1;
do i=clm_from_dt to clm_thru_dt;
day + 1;
output;
end;
drop i clm_from_dt clm_thru_dt;
run;
proc sort data=day_hosp;
by patID day;
run;
data day_hosp2;
set day_hosp;
by patID day source;
/* remove overlapping days*/
if last.day then
output;
run;
/* STEP2 : creation of a dataset with 1 row per day until 12/31/2018 */
proc sql;
create table day_all as select patID, min(clm_from_dt) as start_date
format=date9., '31DEC2018'd as end_date format=date9.
from have group by patID order by patID;
quit;
data day_all2;
set day_all;
format day date9.;
by patID;
do day=(start_date) to (end_date);
day+1;
day=day-1;
output;
end;
drop start_date end_date;
run;
/* STEP3 : merge */
data day_all;
format day date9.;
merge day_hosp2 day_all2;
by patID day;
run;
data want;
set day_all;
format clm_from_dt clm_thru_dt date9.;
by patID source notsorted;
retain clm_from_dt;
if first.source then
clm_from_dt=day;
if last.source then
clm_thru_dt=day;
if last.source then
output;
keep PatID FacID clm_from_dt clm_thru_dt;
run;
proc print data=want;
id patID;
run;
Hi @Bobo3
I don't understand why do you still expect overlapping dates in your output table:
e.g. for patient 001:
001 40001 1/3/2018 1/6/2018
001 1/6/2018 1/14/2018 -> why don't you expect 1/7/2018 and 1/13/2018 ?
001 40001 1/14/2018 1/19/2018
001 1/19/2018 3/7/2018 -> why don't you expect 1/20/2018 and 3/6/2018 ?
001 40001 3/7/2018 3/10/2018
001 3/10/2018 12/31/2018 -> why don't you expect 3/11/2018 and 12/31/2018 ?
If this is a mistake, you could try the following program. When FacID is missing, it corresponds to date ranges without hospitalisations.
Hope this help.
data have;
input PatID $ FacID $ clm_from_dt:mmddyy8. clm_thru_dt:mmddyy8.;
format clm_from_dt clm_thru_dt date9.;
cards;
001 40001 1/3/2018 1/6/2018
001 40001 1/14/2018 1/19/2018
001 40001 3/7/2018 3/10/2018
002 43517 2/1/2018 3/31/2018
002 40002 2/10/2018 2/11/2018
002 42168 2/12/2018 2/22/2018
002 43517 4/1/2018 5/31/2018
002 40002 4/5/2018 4/7/2018
002 43517 6/1/2018 7/31/2018
;
run;
/* STEP1 : creation of 1 dataset with 1 row per day of hospitalization */
data day_hosp;
set have;
/* Each hospital transition is identified by the variable source */
source + 1;
/* 1 row per day at hospital */
format day date9.;
day=clm_from_dt - 1;
do i=clm_from_dt to clm_thru_dt;
day + 1;
output;
end;
drop i clm_from_dt clm_thru_dt;
run;
proc sort data=day_hosp;
by patID day;
run;
data day_hosp2;
set day_hosp;
by patID day source;
/* remove overlapping days*/
if last.day then
output;
run;
/* STEP2 : creation of a dataset with 1 row per day until 12/31/2018 */
proc sql;
create table day_all as select patID, min(clm_from_dt) as start_date
format=date9., '31DEC2018'd as end_date format=date9.
from have group by patID order by patID;
quit;
data day_all2;
set day_all;
format day date9.;
by patID;
do day=(start_date) to (end_date);
day+1;
day=day-1;
output;
end;
drop start_date end_date;
run;
/* STEP3 : merge */
data day_all;
format day date9.;
merge day_hosp2 day_all2;
by patID day;
run;
data want;
set day_all;
format clm_from_dt clm_thru_dt date9.;
by patID source notsorted;
retain clm_from_dt;
if first.source then
clm_from_dt=day;
if last.source then
clm_thru_dt=day;
if last.source then
output;
keep PatID FacID clm_from_dt clm_thru_dt;
run;
proc print data=want;
id patID;
run;
To eliminate overlaps and fill in the gaps (if that's what you want) :
data have;
input PatID $ FacID $ clm_from_dt :mmddyy. clm_thru_dt :mmddyy.;
format clm: yymmdd10.;
datalines;
002 43517 2/1/2018 3/31/2018
002 40002 2/10/2018 2/11/2018
002 42168 2/12/2018 2/22/2018
002 43517 4/1/2018 5/31/2018
002 40002 4/5/2018 4/7/2018
002 43517 6/1/2018 7/31/2018
;
data want;
array f {%sysevalf("31dec2017"d, integer) : %sysevalf("01jan2019"d,integer) } $8;
f{lbound(f)} = "xxxx";
f{hbound(f)} = "xxxx";
do until(last.patId);
set have; by patId;
do d = clm_from_dt to clm_thru_dt;
f{d} = FacId;
end;
end;
do d = "01jan2018"d to "31dec2018"d;
if f{d} ne f{d-1} then do;
facId = f{d};
clm_from_dt = d;
end;
if f{d} ne f{d+1} then do;
clm_thru_dt = d;
output;
end;
end;
keep patId facId clm_from_dt clm_thru_dt;
run;
proc print; run;
Obs. PatID FacID clm_from_dt clm_thru_dt 1 002 2018-01-01 2018-01-31 2 002 43517 2018-02-01 2018-02-09 3 002 40002 2018-02-10 2018-02-11 4 002 42168 2018-02-12 2018-02-22 5 002 43517 2018-02-23 2018-04-04 6 002 40002 2018-04-05 2018-04-07 7 002 43517 2018-04-08 2018-07-31 8 002 2018-08-01 2018-12-31
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.