Hi all,
I have two data sets, one is claim level data, and the other is (long-term or inpatient) stay level data. I wanted to construct a new field for the stay level data based on the claim level data. For example, when a diagnosis (flag_dx) showed up in a claim during the claim service dates (clm_beg_dt and clm_end_dt), and if the claim service dates are within the stay time window, the stay should take a value 1. Otherwise, the stay has the value of 0.
The claim level data is as below:
data clmfile;
infile datalines truncover dsd;
input Person_ID clm_beg_dt :mmddyy. clm_end_dt :mmddyy. flag_dx;
format clm_beg_dt clm_end_dt mmddyy.;
datalines;
1,4/17/2017,11/30/2017,0
1,4/17/2017,12/31/2017,0
1,4/17/2017,4/30/2018,0
1,4/17/2017,6/30/2018,0
1,4/17/2017,8/31/2018,0
1,4/17/2017,10/31/2018,0
1,4/17/2017,12/31/2018,0
2,7/17/2017,7/20/2017,0
2,7/20/2017,7/31/2017,0
2,8/1/2017,8/5/2017,0
2,6/7/2018,6/10/2018,0
2,8/2/2018,8/5/2018,1
2,8/5/2018,8/14/2018,1
2,8/14/2018,8/19/2018,1
2,8/19/2018,8/31/2018,1
2,8/19/2018,9/30/2018,1
2,8/19/2018,10/31/2018,1
2,8/19/2018,11/30/2022,1
2,8/19/2018,12/31/2018,1
2,9/1/2018,9/5/2018,1
;
run;
The stay level data file is as below:
data stayfile;
infile datalines truncover dsd;
input Person_ID Stay_ID Stay_from_dt :mmddyy. Stay_Thru_dt :mmddyy.;
format Stay_from_dt Stay_Thru_dt mmddyy.;
datalines;
1,1,4/17/2017,12/31/2018
2,1,7/20/2017,8/5/2017
2,2,6/10/2018,6/29/2018
2,3,8/5/2018,12/31/2018
;
run;
The target data file is as below:
data want;
infile datalines truncover dsd;
input Person_ID Stay_ID Stay_from_dt :mmddyy. Stay_Thru_dt :mmddyy. flag_dx;
format Stay_from_dt Stay_Thru_dt mmddyy.;
datalines;
1,1,4/17/2017,12/31/2018,0
2,1,7/20/2017,8/5/2017,0
2,2,6/10/2018,6/29/2018,0
2,3,8/5/2018,12/31/2018,1
;
run;
Any suggestion is greatly appreciated!
@lichee I see 🙂 Try this instead. Let me know if it works for you.
data want(drop = clm_beg_dt clm_end_dt flag_dx);
if _N_ = 1 then do;
dcl hash h(dataset : 'clmfile', multidata : 'Y');
h.definekey('Person_ID');
h.definedata(all : 'Y');
h.definedone();
end;
set stayfile;
if 0 then set clmfile;
call missing(clm_beg_dt, clm_end_dt, flag_dx);
flag = 0;
do while (h.do_over() = 0);
if clm_beg_dt >= Stay_from_dt
and clm_end_dt <= Stay_Thru_dt
and flag_dx = 1 then do;
flag = 1;
end;
end;
run;
What if one of the observations in the claim data set has flag 0 for just 1 of the 'in-between' obs?
Like the obs below. What if flag was 0 instead?
2,8/19/2018,9/30/2018,1
Ok. And to fully understand: The date interval in the claims data must be completely enclosed by the date interval in stayfile, right? Meaning that clm_beg_dt >= Stay_from_dt and clm_end_dt <= Stay_Thru_dt ?
Ok. Try this
data want(drop = clm_beg_dt clm_end_dt flag_dx);
if _N_ = 1 then do;
dcl hash h(dataset : 'clmfile', multidata : 'Y');
h.definekey('Person_ID');
h.definedata(all : 'Y');
h.definedone();
end;
set stayfile;
if 0 then set clmfile;
call missing(clm_beg_dt, clm_end_dt, flag_dx);
flag = 0;
do while (h.do_over() = 0);
if clm_beg_dt >= Stay_from_dt
and clm_end_dt <= Stay_Thru_dt
and flag_dx = 1 then do;
flag = 1;
leave;
end;
end;
run;
Result:
Person_ID Stay_ID Stay_from_dt Stay_Thru_dt flag 1 1 04/17/17 12/31/18 0 2 1 07/20/17 08/05/17 0 2 2 06/10/18 06/29/18 0 2 3 08/05/18 12/31/18 1
Thank you, Peter!
I tried out the code with real data. When there are Person_IDs with flag_dx=0 following Person_IDs with flag_dx=1, the Person_IDs with flag_dx=0 would be replaced with Person_IDs with flag_dx=1.
For example, when there is third person, Person_ID=3, with flag_dx=0, Person_ID would become Person_ID=2. The data below would have an unexpected output using the code you provided. Any further insight?
data clmfile;
infile datalines truncover dsd;
input Person_ID clm_beg_dt :mmddyy. clm_end_dt :mmddyy. flag_dx;
format clm_beg_dt clm_end_dt mmddyy.;
datalines;
1,4/17/2017,11/30/2017,0
1,4/17/2017,12/31/2017,0
1,4/17/2017,4/30/2018,0
1,4/17/2017,6/30/2018,0
1,4/17/2017,8/31/2018,0
1,4/17/2017,10/31/2018,0
1,4/17/2017,12/31/2018,0
2,7/17/2017,7/20/2017,0
2,7/20/2017,7/31/2017,0
2,8/1/2017,8/5/2017,0
2,6/7/2018,6/10/2018,0
2,8/2/2018,8/5/2018,1
2,8/5/2018,8/14/2018,1
2,8/14/2018,8/19/2018,1
2,8/19/2018,8/31/2018,1
2,8/19/2018,9/30/2018,1
2,8/19/2018,10/31/2018,1
2,8/19/2018,11/30/2022,1
2,8/19/2018,12/31/2018,1
2,9/1/2018,9/5/2018,1
3,12/7/2017,12/12/2017,0
3,12/12/2017,12/31/2017,0
3,1/1/2018,1/16/2018,0
;
run;
data stayfile;
infile datalines truncover dsd;
input Person_ID Stay_ID Stay_from_dt :mmddyy. Stay_Thru_dt :mmddyy.;
format Stay_from_dt Stay_Thru_dt mmddyy.;
datalines;
1,1,4/17/2017,12/31/2018
2,1,7/20/2017,8/5/2017
2,2,6/10/2018,6/29/2018
2,3,8/5/2018,12/31/2018
3,1,12/12/2017,1/16/2018
;
run;
@lichee I see 🙂 Try this instead. Let me know if it works for you.
data want(drop = clm_beg_dt clm_end_dt flag_dx);
if _N_ = 1 then do;
dcl hash h(dataset : 'clmfile', multidata : 'Y');
h.definekey('Person_ID');
h.definedata(all : 'Y');
h.definedone();
end;
set stayfile;
if 0 then set clmfile;
call missing(clm_beg_dt, clm_end_dt, flag_dx);
flag = 0;
do while (h.do_over() = 0);
if clm_beg_dt >= Stay_from_dt
and clm_end_dt <= Stay_Thru_dt
and flag_dx = 1 then do;
flag = 1;
end;
end;
run;
Sure thing. If it works, please mark the code above the accepted answer 🙂
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.