Hi folks,
My data has
- Date pairs (elig_beg and elig_end) for varying length of time intervals to mark insurance coverage periods. Everybody in this dataset has insurance, but, on and off basis. For example: 01/05/2005(elig_beg1)-01/08/2005(elig_end1) and 01/01/2007(elig_beg2)-01/01/2010(elig_end2). This patient had two coverage intervals. First coverage lasted 3 months and the last coverage lasted 3 years.
- Date_diagnosis : date of diagnosis of a disease
- ID: unique patient identifier
- Eligibility data in separate columns (elig_beg1 thru 50 and elig_end1 thru 50) and unique for each patients.
My goal is to output the date pair that included date_diagnosis as shown in the image below for the first row of "have" demo data.
For instance: 10DEC2004 fell in the date pair: elig_beg/elig_end1, thus, desired output is two variables such as "cov_beg/cov_end". If I identify this date pair (cov_beg and cov-end) that included the date of diagnosis I will be able to calculate the length of the coverage period of insurance that included date_diagnosis in it.
DATA HAVE;
INPUT ID DATE_DIAGNOSIS elig_beg1 elig_end1 elig_beg2 elig_end2;
CARDS;
1 16415 16376 17713 17806 18808
2 16419 16376 17836 . .
3 16428 16376 17836 17867 20819
4 16429 16376 16617 . .
5 16771 16741 20850 . .
6 16792 16741 20819 . .
7 17153 17106 17286 . .
8 17157 17106 17256 . .
9 18235 18202 18352 . .
10 18234 18202 20819 21124 21184
11 18597 18567 18839 19663 20027
12 18624 18567 18992 . .
13 18605 18567 18961 19175 19205
14 18974 18932 19144 . .
15 18963 18932 19631 . .
16 18991 18932 19052 . .
17 19339 19298 21184 . .
18 19353 19298 19389 . .
19 19331 19298 19631 . .
20 19716 19663 20819 . .
21 19345 19298 19601 . .
22 20075 20028 20819 20851 20909
23 20065 20028 20361 . .
;
Hi @Cruise
I think it can be done in the first step with a slight modification of the loop, thus eliminating the intermediate data set and the 3 following steps. It seems that all diagnosis dates fall within the first interval in your example data, so I took the liberty to change a couple of diagnosis dates to test the code.
DATA have;
format id 8. DATE_DIAGNOSIS elig_beg1 elig_end1 elig_beg2 elig_end2 date9.;
informat DATE_DIAGNOSIS elig_beg1 elig_end1 elig_beg2 elig_end2 date9.;
input ID DATE_DIAGNOSIS elig_beg1 elig_end1 elig_beg2 elig_end2;
cards;
1 10DEC2004 01NOV2004 30JUN2008 01OCT2008 30JUN2011
2 14DEC2004 01NOV2004 31OCT2008 . .
3 01DEC2008 01NOV2004 31OCT2008 01DEC2008 31DEC2016
4 24DEC2004 01NOV2004 30JUN2005 . .
5 01DEC2004 01NOV2005 31JAN2017 . .
6 22DEC2005 01NOV2005 31DEC2016 . .
7 18DEC2006 01NOV2006 30APR2007 . .
8 22DEC2006 01NOV2006 31MAR2007 . .
9 04DEC2009 01NOV2009 31MAR2010 . .
10 03DEC2009 01NOV2009 31DEC2016 01NOV2017 31DEC2017
11 01DEC2010 01NOV2010 31JUL2011 01NOV2013 31OCT2014
12 28DEC2010 01NOV2010 31DEC2011 . .
13 09DEC2010 01NOV2010 30NOV2011 01JUL2012 31JUL2012
14 13DEC2011 01NOV2011 31MAY2012 . .
15 02DEC2011 01NOV2011 30SEP2013 . .
16 30DEC2011 01NOV2011 29FEB2012 . .
17 12DEC2012 01NOV2012 31DEC2017 . .
18 26DEC2012 01NOV2012 31JAN2013 . .
19 04DEC2012 01NOV2012 30SEP2013 . .
20 24DEC2013 01NOV2013 31DEC2016 . .
21 18DEC2012 01NOV2012 31AUG2013 . .
22 18DEC2014 01NOV2014 31DEC2016 01FEB2017 31MAR2017
23 08DEC2014 01NOV2014 30SEP2015 . .
;
run;
data want (drop=i); set have;
array beg {*} elig_beg:;
array end {*} elig_end:;
format cov_beg cov_end date9.;
do i = 1 to dim(beg);
if missing(beg{i}) then leave;
if beg{i} <= DATE_DIAGNOSIS <= end{i} then do;
cov_beg = beg{i};
cov_end = end{i};
cov_days = cov_end - cov_beg + 1;
leave;
end;
end;
run;
What happens in leap years?
"Approximate rounding" isn't a sufficient definition to code from. To be more precise you could ignore 29 Feb if it falls within any of your date pairs.
I think, I solved the problem using Patrick's solution provided in the earlier forum. @Patrick
data P.long(drop=_: elig_beg: elig_end:); /*NROW=19,931*/ set MY_ACTUAL_DATA; /*NROW=19,931, N_SUBJ=10,341*/
array _elig_beg {*} elig_beg:;
array _end {*} elig_end:;
attrib insurance_elig_beg_dt insurance_end_dt format=date9.;
do _i=1 to dim(_elig_beg);
if missing(_elig_beg[_i]) then continue;
insurance_elig_beg_dt=_elig_beg[_i];
insurance_end_dt=_end[_i];
output;
end;
run;
DATA P.long1; SET P.long; /*NROW=19,931, N_SUBJ=10,341*/
IF insurance_elig_beg_dt<=DATE_DIAGNOSIS<=insurance_end_dt THEN COV=1; ELSE COV=0;
RUN;
Proc sort data=P.long1;
By DISPLAY_ID COV;
Run;
Data P.long2; Set P.long1;
By DISPLAY_ID;
If last.DISPLAY_ID;
Run;
Hi @Cruise
I think it can be done in the first step with a slight modification of the loop, thus eliminating the intermediate data set and the 3 following steps. It seems that all diagnosis dates fall within the first interval in your example data, so I took the liberty to change a couple of diagnosis dates to test the code.
DATA have;
format id 8. DATE_DIAGNOSIS elig_beg1 elig_end1 elig_beg2 elig_end2 date9.;
informat DATE_DIAGNOSIS elig_beg1 elig_end1 elig_beg2 elig_end2 date9.;
input ID DATE_DIAGNOSIS elig_beg1 elig_end1 elig_beg2 elig_end2;
cards;
1 10DEC2004 01NOV2004 30JUN2008 01OCT2008 30JUN2011
2 14DEC2004 01NOV2004 31OCT2008 . .
3 01DEC2008 01NOV2004 31OCT2008 01DEC2008 31DEC2016
4 24DEC2004 01NOV2004 30JUN2005 . .
5 01DEC2004 01NOV2005 31JAN2017 . .
6 22DEC2005 01NOV2005 31DEC2016 . .
7 18DEC2006 01NOV2006 30APR2007 . .
8 22DEC2006 01NOV2006 31MAR2007 . .
9 04DEC2009 01NOV2009 31MAR2010 . .
10 03DEC2009 01NOV2009 31DEC2016 01NOV2017 31DEC2017
11 01DEC2010 01NOV2010 31JUL2011 01NOV2013 31OCT2014
12 28DEC2010 01NOV2010 31DEC2011 . .
13 09DEC2010 01NOV2010 30NOV2011 01JUL2012 31JUL2012
14 13DEC2011 01NOV2011 31MAY2012 . .
15 02DEC2011 01NOV2011 30SEP2013 . .
16 30DEC2011 01NOV2011 29FEB2012 . .
17 12DEC2012 01NOV2012 31DEC2017 . .
18 26DEC2012 01NOV2012 31JAN2013 . .
19 04DEC2012 01NOV2012 30SEP2013 . .
20 24DEC2013 01NOV2013 31DEC2016 . .
21 18DEC2012 01NOV2012 31AUG2013 . .
22 18DEC2014 01NOV2014 31DEC2016 01FEB2017 31MAR2017
23 08DEC2014 01NOV2014 30SEP2015 . .
;
run;
data want (drop=i); set have;
array beg {*} elig_beg:;
array end {*} elig_end:;
format cov_beg cov_end date9.;
do i = 1 to dim(beg);
if missing(beg{i}) then leave;
if beg{i} <= DATE_DIAGNOSIS <= end{i} then do;
cov_beg = beg{i};
cov_end = end{i};
cov_days = cov_end - cov_beg + 1;
leave;
end;
end;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.