Hi all,
I have two data sets, one is claim level data, and the other is (long-term or inpatient) stay level data. I wanted to construct a new field for the stay level data based on the claim level data. For example, when a diagnosis (flag_dx) showed up in a claim during the claim service dates (clm_beg_dt and clm_end_dt), and if the claim service dates are within the stay time window, the stay should take a value 1. Otherwise, the stay has the value of 0.
The claim level data is as below:
data clmfile;
infile datalines truncover dsd;
input Person_ID clm_beg_dt :mmddyy. clm_end_dt :mmddyy. flag_dx;
format clm_beg_dt clm_end_dt mmddyy.;
datalines;
1,4/17/2017,11/30/2017,0
1,4/17/2017,12/31/2017,0
1,4/17/2017,4/30/2018,0
1,4/17/2017,6/30/2018,0
1,4/17/2017,8/31/2018,0
1,4/17/2017,10/31/2018,0
1,4/17/2017,12/31/2018,0
2,7/17/2017,7/20/2017,0
2,7/20/2017,7/31/2017,0
2,8/1/2017,8/5/2017,0
2,6/7/2018,6/10/2018,0
2,8/2/2018,8/5/2018,1
2,8/5/2018,8/14/2018,1
2,8/14/2018,8/19/2018,1
2,8/19/2018,8/31/2018,1
2,8/19/2018,9/30/2018,1
2,8/19/2018,10/31/2018,1
2,8/19/2018,11/30/2022,1
2,8/19/2018,12/31/2018,1
2,9/1/2018,9/5/2018,1
;
run;
The stay level data file is as below:
data stayfile;
infile datalines truncover dsd;
input Person_ID Stay_ID Stay_from_dt :mmddyy. Stay_Thru_dt :mmddyy.;
format Stay_from_dt Stay_Thru_dt mmddyy.;
datalines;
1,1,4/17/2017,12/31/2018
2,1,7/20/2017,8/5/2017
2,2,6/10/2018,6/29/2018
2,3,8/5/2018,12/31/2018
;
run;
The target data file is as below:
data want;
infile datalines truncover dsd;
input Person_ID Stay_ID Stay_from_dt :mmddyy. Stay_Thru_dt :mmddyy. flag_dx;
format Stay_from_dt Stay_Thru_dt mmddyy.;
datalines;
1,1,4/17/2017,12/31/2018,0
2,1,7/20/2017,8/5/2017,0
2,2,6/10/2018,6/29/2018,0
2,3,8/5/2018,12/31/2018,1
;
run;
Any suggestion is greatly appreciated!
@lichee I see 🙂 Try this instead. Let me know if it works for you.
data want(drop = clm_beg_dt clm_end_dt flag_dx);
if _N_ = 1 then do;
dcl hash h(dataset : 'clmfile', multidata : 'Y');
h.definekey('Person_ID');
h.definedata(all : 'Y');
h.definedone();
end;
set stayfile;
if 0 then set clmfile;
call missing(clm_beg_dt, clm_end_dt, flag_dx);
flag = 0;
do while (h.do_over() = 0);
if clm_beg_dt >= Stay_from_dt
and clm_end_dt <= Stay_Thru_dt
and flag_dx = 1 then do;
flag = 1;
end;
end;
run;
What if one of the observations in the claim data set has flag 0 for just 1 of the 'in-between' obs?
Like the obs below. What if flag was 0 instead?
2,8/19/2018,9/30/2018,1
Ok. And to fully understand: The date interval in the claims data must be completely enclosed by the date interval in stayfile, right? Meaning that clm_beg_dt >= Stay_from_dt and clm_end_dt <= Stay_Thru_dt ?
Ok. Try this
data want(drop = clm_beg_dt clm_end_dt flag_dx);
if _N_ = 1 then do;
dcl hash h(dataset : 'clmfile', multidata : 'Y');
h.definekey('Person_ID');
h.definedata(all : 'Y');
h.definedone();
end;
set stayfile;
if 0 then set clmfile;
call missing(clm_beg_dt, clm_end_dt, flag_dx);
flag = 0;
do while (h.do_over() = 0);
if clm_beg_dt >= Stay_from_dt
and clm_end_dt <= Stay_Thru_dt
and flag_dx = 1 then do;
flag = 1;
leave;
end;
end;
run;
Result:
Person_ID Stay_ID Stay_from_dt Stay_Thru_dt flag 1 1 04/17/17 12/31/18 0 2 1 07/20/17 08/05/17 0 2 2 06/10/18 06/29/18 0 2 3 08/05/18 12/31/18 1
Thank you, Peter!
I tried out the code with real data. When there are Person_IDs with flag_dx=0 following Person_IDs with flag_dx=1, the Person_IDs with flag_dx=0 would be replaced with Person_IDs with flag_dx=1.
For example, when there is third person, Person_ID=3, with flag_dx=0, Person_ID would become Person_ID=2. The data below would have an unexpected output using the code you provided. Any further insight?
data clmfile;
infile datalines truncover dsd;
input Person_ID clm_beg_dt :mmddyy. clm_end_dt :mmddyy. flag_dx;
format clm_beg_dt clm_end_dt mmddyy.;
datalines;
1,4/17/2017,11/30/2017,0
1,4/17/2017,12/31/2017,0
1,4/17/2017,4/30/2018,0
1,4/17/2017,6/30/2018,0
1,4/17/2017,8/31/2018,0
1,4/17/2017,10/31/2018,0
1,4/17/2017,12/31/2018,0
2,7/17/2017,7/20/2017,0
2,7/20/2017,7/31/2017,0
2,8/1/2017,8/5/2017,0
2,6/7/2018,6/10/2018,0
2,8/2/2018,8/5/2018,1
2,8/5/2018,8/14/2018,1
2,8/14/2018,8/19/2018,1
2,8/19/2018,8/31/2018,1
2,8/19/2018,9/30/2018,1
2,8/19/2018,10/31/2018,1
2,8/19/2018,11/30/2022,1
2,8/19/2018,12/31/2018,1
2,9/1/2018,9/5/2018,1
3,12/7/2017,12/12/2017,0
3,12/12/2017,12/31/2017,0
3,1/1/2018,1/16/2018,0
;
run;
data stayfile;
infile datalines truncover dsd;
input Person_ID Stay_ID Stay_from_dt :mmddyy. Stay_Thru_dt :mmddyy.;
format Stay_from_dt Stay_Thru_dt mmddyy.;
datalines;
1,1,4/17/2017,12/31/2018
2,1,7/20/2017,8/5/2017
2,2,6/10/2018,6/29/2018
2,3,8/5/2018,12/31/2018
3,1,12/12/2017,1/16/2018
;
run;
@lichee I see 🙂 Try this instead. Let me know if it works for you.
data want(drop = clm_beg_dt clm_end_dt flag_dx);
if _N_ = 1 then do;
dcl hash h(dataset : 'clmfile', multidata : 'Y');
h.definekey('Person_ID');
h.definedata(all : 'Y');
h.definedone();
end;
set stayfile;
if 0 then set clmfile;
call missing(clm_beg_dt, clm_end_dt, flag_dx);
flag = 0;
do while (h.do_over() = 0);
if clm_beg_dt >= Stay_from_dt
and clm_end_dt <= Stay_Thru_dt
and flag_dx = 1 then do;
flag = 1;
end;
end;
run;
Sure thing. If it works, please mark the code above the accepted answer 🙂
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.