Hi Folks,
I'm trying to subset data where patients had insurance in the same year the diagnosis was made and in its border months.
Desired output is the variable in the data called "MARK" (0,1) :
1. mark=1 if the year of date_diagnosis falls in any date pairs (elig_beg: - elig_end); or
2. mark=1 if any date pairs (elig_beg: - elig_end) include three month before and after the year(date_diagnosis) to include border month (October thru December and January thru March) in the analysis;
else MARK=0;
Current code checks whether patients ever enrolled in an insurance in the year of diagnosis. With this rule, three observations are not marked. However, if you include border months then all observations will take MARK=1. Please see mock data posted.
Would you please help me modify the code to check the coverage in border months as explained?
Thanks in advance, I greatly appreciate your time indeed.
See below Tom's solution to mark insurance in the year of diagnosis.
DATA HAVE;
format id 8. elig_beg1 elig_end1 elig_beg2 elig_end2 elig_beg3 elig_end3 elig_beg4 elig_end4 elig_beg5 elig_end5 elig_beg6 elig_end6 elig_beg7 elig_end7 elig_beg8 elig_end8 date9. MARK 1. YEAR 4.;
informat elig_beg1 elig_end1 elig_beg2 elig_end2 elig_beg3 elig_end3 elig_beg4 elig_end4 elig_beg5 elig_end5 elig_beg6 elig_end6 elig_beg7 elig_end7 elig_beg8 elig_end8 date9. MARK 1. YEAR 4.;
input ID elig_beg1 elig_end1 elig_beg2 elig_end2 elig_beg3 elig_end3 elig_beg4 elig_end4 elig_beg5 elig_end5 elig_beg6 elig_end6 elig_beg7 elig_end7 elig_beg8 elig_end8 MARK YEAR;
cards;
1 1-Nov-03 30-Nov-03 1-Jan-04 31-Jul-04 1-Oct-04 31-Dec-04 1-Jun-05 31-Jul-05 1-Oct-05 31-Jan-06 1-Feb-11 30-Jun-11 1-Jan-14 28-Feb-14 1-Jan-15 31-Jan-17 1 2005
2 1-Jan-02 31-Jan-02 1-Dec-05 28-Feb-13 1-Dec-16 31-Dec-16 1-Oct-17 30-Nov-17 . . . . . . . . 1 2005
3 1-Jan-02 31-Jan-02 1-Aug-02 30-Sep-02 1-Nov-04 31-Dec-04 1-Jul-11 31-Aug-11 . . . . . . . . 1 2005
4 1-Aug-05 31-Dec-06 1-Dec-09 31-Dec-10 . . . . . . . . . . . . 1 2005
5 1-May-02 31-May-02 1-Mar-03 31-May-03 1-Oct-03 30-Nov-03 1-Jan-04 29-Feb-04 1-Jan-04 30-Apr-04 1-Apr-06 31-May-06 1-Oct-06 30-Nov-06 . . 1 2005
6 1-Mar-05 31-Mar-05 1-Feb-06 28-Feb-06 . . . . . . . . . . . . 1 2005
7 1-Mar-05 31-May-05 1-Jan-06 28-Feb-06 1-Feb-07 28-Feb-07 . . . . . . . . . . 1 2005
8 1-Jun-05 30-Jun-05 1-Aug-05 30-Sep-05 1-May-06 30-Jun-06 1-Nov-09 31-Dec-09 1-May-11 30-Nov-11 1-Jan-13 28-Feb-13 1-Dec-13 30-Nov-14 1-Mar-15 31-May-15 1 2005
9 1-Jun-04 30-Jun-04 1-Sep-04 31-Oct-04 1-May-05 30-Jun-05 1-Sep-06 30-Nov-06 1-Feb-07 30-Nov-07 1-Nov-08 31-Dec-08 1-Apr-09 30-Apr-09 . . 1 2005
10 1-May-05 31-Jul-05 1-Apr-06 31-May-06 1-Apr-09 31-May-09 1-Apr-10 30-Apr-10 1-Apr-11 30-Jun-11 1-Mar-12 30-Apr-12 1-Mar-13 31-Mar-13 1-Dec-16 31-Mar-17 1 2018
;
data want;
set have;
array e_beg elig_beg: ;
array e_end elig_end: ;
do i=1 to dim(e_beg) until(enrolled);
enrolled = year(e_beg[i]) <= year <= year(e_end[i]);
end;
drop i;
run;
proc print; run;
https://communities.sas.com/t5/SAS-Communities-Library/Working-with-Dates-and-Times-in-SAS-Tutorial/... has a PDF with much information about dates.
Hey Cruise,
I believe I understand what you are asking for in your post and came up with this logic:
DATA Have;
FORMAT id 8. elig_beg1 elig_end1 elig_beg2 elig_end2 elig_beg3 elig_end3 elig_beg4 elig_end4 elig_beg5 elig_end5 elig_beg6 elig_end6 elig_beg7 elig_end7 elig_beg8 elig_end8 date9. YEAR 4.;
INFORMAT elig_beg1 elig_end1 elig_beg2 elig_end2 elig_beg3 elig_end3 elig_beg4 elig_end4 elig_beg5 elig_end5 elig_beg6 elig_end6 elig_beg7 elig_end7 elig_beg8 elig_end8 date9. YEAR 4.;
INPUT ID elig_beg1 elig_end1 elig_beg2 elig_end2 elig_beg3 elig_end3 elig_beg4 elig_end4 elig_beg5 elig_end5 elig_beg6 elig_end6 elig_beg7 elig_end7 elig_beg8 elig_end8 YEAR;
CARDS;
1 1-Nov-03 30-Nov-03 1-Jan-04 31-Jul-04 1-Oct-04 31-Dec-04 1-Jun-05 31-Jul-05 1-Oct-05 31-Jan-06 1-Feb-11 30-Jun-11 1-Jan-14 28-Feb-14 1-Jan-15 31-Jan-17 2005
2 1-Jan-02 31-Jan-02 1-Dec-05 28-Feb-13 1-Dec-16 31-Dec-16 1-Oct-17 30-Nov-17 . . . . . . . . 2005
3 1-Jan-02 31-Jan-02 1-Aug-02 30-Sep-02 1-Nov-04 31-Dec-04 1-Jul-11 31-Aug-11 . . . . . . . . 2005
4 1-Aug-05 31-Dec-06 1-Dec-09 31-Dec-10 . . . . . . . . . . . . 2005
5 1-May-02 31-May-02 1-Mar-03 31-May-03 1-Oct-03 30-Nov-03 1-Jan-04 29-Feb-04 1-Jan-04 30-Apr-04 1-Apr-06 31-May-06 1-Oct-06 30-Nov-06 . . 2005
6 1-Mar-05 31-Mar-05 1-Feb-06 28-Feb-06 . . . . . . . . . . . . 2005
7 1-Mar-05 31-May-05 1-Jan-06 28-Feb-06 1-Feb-07 28-Feb-07 . . . . . . . . . . 2005
8 1-Jun-05 30-Jun-05 1-Aug-05 30-Sep-05 1-May-06 30-Jun-06 1-Nov-09 31-Dec-09 1-May-11 30-Nov-11 1-Jan-13 28-Feb-13 1-Dec-13 30-Nov-14 1-Mar-15 31-May-15 2005
9 1-Jun-04 30-Jun-04 1-Sep-04 31-Oct-04 1-May-05 30-Jun-05 1-Sep-06 30-Nov-06 1-Feb-07 30-Nov-07 1-Nov-08 31-Dec-08 1-Apr-09 30-Apr-09 . . 2005
10 1-May-05 31-Jul-05 1-Apr-06 31-May-06 1-Apr-09 31-May-09 1-Apr-10 30-Apr-10 1-Apr-11 30-Jun-11 1-Mar-12 30-Apr-12 1-Mar-13 31-Mar-13 1-Dec-16 31-Mar-17 2018
;
DATA Want;
SET Have;
ARRAY e_beg elig_beg: ;
ARRAY e_end elig_end: ;
DO i=1 TO DIM(e_beg) UNTIL(Enrolled);
EnrolledA = YEAR(e_beg[i]) <= Year <= YEAR(e_end[i]);
EnrolledB = INTNX('MONTH',MDY(1,1,Year),-3) <= (e_beg[i]) <= INTNX('MONTH',MDY(12,31,Year),3);
EnrolledC = INTNX('MONTH',MDY(1,1,Year),-3) <= (e_end[i]) <= INTNX('MONTH',MDY(12,31,Year),3);
IF EnrolledA=1 OR EnrolledB=1 OR EnrolledC=1 THEN Enrolled=1;
END;
MARK= COALESCE(Enrolled,0);
DROP i Enrolled EnrolledA EnrolledB EnrolledC;
RUN;
PROC PRINT; RUN;
From what I understood, to identify the 'border month' related accounts, so long as the beginning date or end date (or both) are within 3 months prior or 3 months after the diagnosis year, then they would receive a 'Mark' value of 1.
For example, observation 3:
Diagnosed = 2005
Elig_beg3 = 01NOV2004
Elig_end3 = 31DEC2004
So, since 3 months prior to 2005 would include October 2004 and later, this account should be updated with 'Mark' value of 1 (with the new logic provided above). There may be easier ways to go about it, but I did some validation and was able to confirm this was performing as needed.
One thing that confused me was your comment, that when the new logic is implemented, all 10 account examples should receive a 'Mark' value of 1. However, I did not find that to be the case.
Examples: Observation 5 & Observation 10
5 | 01MAY2002 | 31MAY2002 | 01MAR2003 | 31MAY2003 | 01OCT2003 | 30NOV2003 | 01JAN2004 | 29FEB2004 | 01JAN2004 | 30APR2004 | 01APR2006 | 31MAY2006 | 01OCT2006 | 30NOV2006 | . | . | 2005 | 0 |
10 | 01MAY2005 | 31JUL2005 | 01APR2006 | 31MAY2006 | 01APR2009 | 31MAY2009 | 01APR2010 | 30APR2010 | 01APR2011 | 30JUN2011 | 01MAR2012 | 30APR2012 | 01MAR2013 | 31MAR2013 | 01DEC2016 | 31MAR2017 | 2018 | 0 |
Obs5 has a diagnosis of 2005. But none of the eligibility pairs listed fall anywhere between October 2004 and March 2006
Obs10 has a diagnosis of 2018. But none of the eligibility pairs listed fall anywhere between October 2017 and March 2019
So if I understood the requested criteria correctly, these two observations should not receive a 'Mark' value of 1.
Hopefully this helps
Tom Sap
DATA Have;
FORMAT id 8. elig_beg1 elig_end1 elig_beg2 elig_end2 elig_beg3 elig_end3 elig_beg4 elig_end4 elig_beg5 elig_end5 elig_beg6 elig_end6 elig_beg7 elig_end7 elig_beg8 elig_end8 date9. YEAR 4.;
INFORMAT elig_beg1 elig_end1 elig_beg2 elig_end2 elig_beg3 elig_end3 elig_beg4 elig_end4 elig_beg5 elig_end5 elig_beg6 elig_end6 elig_beg7 elig_end7 elig_beg8 elig_end8 date9. YEAR 4.;
INPUT ID elig_beg1 elig_end1 elig_beg2 elig_end2 elig_beg3 elig_end3 elig_beg4 elig_end4 elig_beg5 elig_end5 elig_beg6 elig_end6 elig_beg7 elig_end7 elig_beg8 elig_end8 YEAR;
CARDS;
1 1-Nov-03 30-Nov-03 1-Jan-04 31-Jul-04 1-Oct-04 31-Dec-04 1-Jun-05 31-Jul-05 1-Oct-05 31-Jan-06 1-Feb-11 30-Jun-11 1-Jan-14 28-Feb-14 1-Jan-15 31-Jan-17 2005
2 1-Jan-02 31-Jan-02 1-Dec-05 28-Feb-13 1-Dec-16 31-Dec-16 1-Oct-17 30-Nov-17 . . . . . . . . 2005
3 1-Jan-02 31-Jan-02 1-Aug-02 30-Sep-02 1-Nov-04 31-Dec-04 1-Jul-11 31-Aug-11 . . . . . . . . 2005
4 1-Aug-05 31-Dec-06 1-Dec-09 31-Dec-10 . . . . . . . . . . . . 2005
5 1-May-02 31-May-02 1-Mar-03 31-May-03 1-Oct-03 30-Nov-03 1-Jan-04 29-Feb-04 1-Jan-04 30-Apr-04 1-Apr-06 31-May-06 1-Oct-06 30-Nov-06 . . 2005
6 1-Mar-05 31-Mar-05 1-Feb-06 28-Feb-06 . . . . . . . . . . . . 2005
7 1-Mar-05 31-May-05 1-Jan-06 28-Feb-06 1-Feb-07 28-Feb-07 . . . . . . . . . . 2005
8 1-Jun-05 30-Jun-05 1-Aug-05 30-Sep-05 1-May-06 30-Jun-06 1-Nov-09 31-Dec-09 1-May-11 30-Nov-11 1-Jan-13 28-Feb-13 1-Dec-13 30-Nov-14 1-Mar-15 31-May-15 2005
9 1-Jun-04 30-Jun-04 1-Sep-04 31-Oct-04 1-May-05 30-Jun-05 1-Sep-06 30-Nov-06 1-Feb-07 30-Nov-07 1-Nov-08 31-Dec-08 1-Apr-09 30-Apr-09 . . 2005
10 1-May-05 31-Jul-05 1-Apr-06 31-May-06 1-Apr-09 31-May-09 1-Apr-10 30-Apr-10 1-Apr-11 30-Jun-11 1-Mar-12 30-Apr-12 1-Mar-13 31-Mar-13 1-Dec-16 31-Mar-17 2018
;
DATA Want;
SET Have;
ARRAY e_beg elig_beg: ;
ARRAY e_end elig_end: ;
DO i=1 TO DIM(e_beg) UNTIL(Enrolled);
EnrolledA = YEAR(e_beg[i]) <= Year <= YEAR(e_end[i]);
EnrolledB = INTNX('MONTH',MDY(1,1,Year),-3) <= (e_beg[i]) <= INTNX('MONTH',MDY(12,31,Year),3);
EnrolledC = INTNX('MONTH',MDY(1,1,Year),-3) <= (e_end[i]) <= INTNX('MONTH',MDY(12,31,Year),3);
IF EnrolledA=1 OR EnrolledB=1 OR EnrolledC=1 THEN Enrolled=1;
END;
MARK= COALESCE(Enrolled,0);
DROP i Enrolled EnrolledA EnrolledB EnrolledC;
RUN;
Results:
id elig_beg1 elig_end1 elig_beg2 elig_end2 elig_beg3 elig_end3 elig_beg4 elig_end4 elig_beg5 elig_end5 elig_beg6 elig_end6 elig_beg7 elig_end7 elig_beg8 elig_end8 YEAR MARK
1 01NOV2003 30NOV2003 01JAN2004 31JUL2004 01OCT2004 31DEC2004 01JUN2005 31JUL2005 01OCT2005 31JAN2006 01FEB2011 30JUN2011 01JAN2014 28FEB2014 01JAN2015 31JAN2017 2005 1
2 01JAN2002 31JAN2002 01DEC2005 28FEB2013 01DEC2016 31DEC2016 01OCT2017 30NOV2017 . . . . . . . . 2005 1
3 01JAN2002 31JAN2002 01AUG2002 30SEP2002 01NOV2004 31DEC2004 01JUL2011 31AUG2011 . . . . . . . . 2005 1
4 01AUG2005 31DEC2006 01DEC2009 31DEC2010 . . . . . . . . . . . . 2005 1
5 01MAY2002 31MAY2002 01MAR2003 31MAY2003 01OCT2003 30NOV2003 01JAN2004 29FEB2004 01JAN2004 30APR2004 01APR2006 31MAY2006 01OCT2006 30NOV2006 . . 2005 0
6 01MAR2005 31MAR2005 01FEB2006 28FEB2006 . . . . . . . . . . . . 2005 1
7 01MAR2005 31MAY2005 01JAN2006 28FEB2006 01FEB2007 28FEB2007 . . . . . . . . . . 2005 1
8 01JUN2005 30JUN2005 01AUG2005 30SEP2005 01MAY2006 30JUN2006 01NOV2009 31DEC2009 01MAY2011 30NOV2011 01JAN2013 28FEB2013 01DEC2013 30NOV2014 01MAR2015 31MAY2015 2005 1
9 01JUN2004 30JUN2004 01SEP2004 31OCT2004 01MAY2005 30JUN2005 01SEP2006 30NOV2006 01FEB2007 30NOV2007 01NOV2008 31DEC2008 01APR2009 30APR2009 . . 2005 1
10 01MAY2005 31JUL2005 01APR2006 31MAY2006 01APR2009 31MAY2009 01APR2010 30APR2010 01APR2011 30JUN2011 01MAR2012 30APR2012 01MAR2013 31MAR2013 01DEC2016 31MAR2017 2018 0
If I understood the new criteria that was needed to update the Mark variable for the 'border' qualifying accounts, all 10 patients would not end up with a 'Mark' value of 1. Observation 5 and Observation 10 do not have beginning or ending dates that fall in within the new window of time, including the additional 3 months before/after diagnosis year.
Hope this helps,
Tom Sap
DATA Have;
FORMAT id 8. elig_beg1 elig_end1 elig_beg2 elig_end2 elig_beg3 elig_end3 elig_beg4 elig_end4 elig_beg5 elig_end5 elig_beg6 elig_end6 elig_beg7 elig_end7 elig_beg8 elig_end8 date9. YEAR 4.;
INFORMAT elig_beg1 elig_end1 elig_beg2 elig_end2 elig_beg3 elig_end3 elig_beg4 elig_end4 elig_beg5 elig_end5 elig_beg6 elig_end6 elig_beg7 elig_end7 elig_beg8 elig_end8 date9. YEAR 4.;
INPUT ID elig_beg1 elig_end1 elig_beg2 elig_end2 elig_beg3 elig_end3 elig_beg4 elig_end4 elig_beg5 elig_end5 elig_beg6 elig_end6 elig_beg7 elig_end7 elig_beg8 elig_end8 YEAR;
CARDS;
1 1-Nov-03 30-Nov-03 1-Jan-04 31-Jul-04 1-Oct-04 31-Dec-04 1-Jun-05 31-Jul-05 1-Oct-05 31-Jan-06 1-Feb-11 30-Jun-11 1-Jan-14 28-Feb-14 1-Jan-15 31-Jan-17 2005
2 1-Jan-02 31-Jan-02 1-Dec-05 28-Feb-13 1-Dec-16 31-Dec-16 1-Oct-17 30-Nov-17 . . . . . . . . 2005
3 1-Jan-02 31-Jan-02 1-Aug-02 30-Sep-02 1-Nov-04 31-Dec-04 1-Jul-11 31-Aug-11 . . . . . . . . 2005
4 1-Aug-05 31-Dec-06 1-Dec-09 31-Dec-10 . . . . . . . . . . . . 2005
5 1-May-02 31-May-02 1-Mar-03 31-May-03 1-Oct-03 30-Nov-03 1-Jan-04 29-Feb-04 1-Jan-04 30-Apr-04 1-Apr-06 31-May-06 1-Oct-06 30-Nov-06 . . 2005
6 1-Mar-05 31-Mar-05 1-Feb-06 28-Feb-06 . . . . . . . . . . . . 2005
7 1-Mar-05 31-May-05 1-Jan-06 28-Feb-06 1-Feb-07 28-Feb-07 . . . . . . . . . . 2005
8 1-Jun-05 30-Jun-05 1-Aug-05 30-Sep-05 1-May-06 30-Jun-06 1-Nov-09 31-Dec-09 1-May-11 30-Nov-11 1-Jan-13 28-Feb-13 1-Dec-13 30-Nov-14 1-Mar-15 31-May-15 2005
9 1-Jun-04 30-Jun-04 1-Sep-04 31-Oct-04 1-May-05 30-Jun-05 1-Sep-06 30-Nov-06 1-Feb-07 30-Nov-07 1-Nov-08 31-Dec-08 1-Apr-09 30-Apr-09 . . 2005
10 1-May-05 31-Jul-05 1-Apr-06 31-May-06 1-Apr-09 31-May-09 1-Apr-10 30-Apr-10 1-Apr-11 30-Jun-11 1-Mar-12 30-Apr-12 1-Mar-13 31-Mar-13 1-Dec-16 31-Mar-17 2018
;
DATA Want;
SET Have;
ARRAY e_beg elig_beg: ;
ARRAY e_end elig_end: ;
DO i=1 TO DIM(e_beg) UNTIL(Enrolled);
EnrolledA = YEAR(e_beg[i]) <= Year <= YEAR(e_end[i]);
EnrolledB = INTNX('MONTH',MDY(1,1,Year),-3) <= (e_beg[i]) <= INTNX('MONTH',MDY(12,31,Year),3);
EnrolledC = INTNX('MONTH',MDY(1,1,Year),-3) <= (e_end[i]) <= INTNX('MONTH',MDY(12,31,Year),3);
IF EnrolledA=1 OR EnrolledB=1 OR EnrolledC=1 THEN Enrolled=1;
END;
MARK= COALESCE(Enrolled,0);
DROP i Enrolled EnrolledA EnrolledB EnrolledC;
RUN;
PROC PRINT; RUN;
Results:
id elig_beg1 elig_end1 elig_beg2 elig_end2 elig_beg3 elig_end3 elig_beg4 elig_end4 elig_beg5 elig_end5 elig_beg6 elig_end6 elig_beg7 elig_end7 elig_beg8 elig_end8 YEAR MARK
1 01NOV2003 30NOV2003 01JAN2004 31JUL2004 01OCT2004 31DEC2004 01JUN2005 31JUL2005 01OCT2005 31JAN2006 01FEB2011 30JUN2011 01JAN2014 28FEB2014 01JAN2015 31JAN2017 2005 1
2 01JAN2002 31JAN2002 01DEC2005 28FEB2013 01DEC2016 31DEC2016 01OCT2017 30NOV2017 . . . . . . . . 2005 1
3 01JAN2002 31JAN2002 01AUG2002 30SEP2002 01NOV2004 31DEC2004 01JUL2011 31AUG2011 . . . . . . . . 2005 1
4 01AUG2005 31DEC2006 01DEC2009 31DEC2010 . . . . . . . . . . . . 2005 1
5 01MAY2002 31MAY2002 01MAR2003 31MAY2003 01OCT2003 30NOV2003 01JAN2004 29FEB2004 01JAN2004 30APR2004 01APR2006 31MAY2006 01OCT2006 30NOV2006 . . 2005 0
6 01MAR2005 31MAR2005 01FEB2006 28FEB2006 . . . . . . . . . . . . 2005 1
7 01MAR2005 31MAY2005 01JAN2006 28FEB2006 01FEB2007 28FEB2007 . . . . . . . . . . 2005 1
8 01JUN2005 30JUN2005 01AUG2005 30SEP2005 01MAY2006 30JUN2006 01NOV2009 31DEC2009 01MAY2011 30NOV2011 01JAN2013 28FEB2013 01DEC2013 30NOV2014 01MAR2015 31MAY2015 2005 1
9 01JUN2004 30JUN2004 01SEP2004 31OCT2004 01MAY2005 30JUN2005 01SEP2006 30NOV2006 01FEB2007 30NOV2007 01NOV2008 31DEC2008 01APR2009 30APR2009 . . 2005 1
10 01MAY2005 31JUL2005 01APR2006 31MAY2006 01APR2009 31MAY2009 01APR2010 30APR2010 01APR2011 30JUN2011 01MAR2012 30APR2012 01MAR2013 31MAR2013 01DEC2016 31MAR2017 2018 0
If I understood the new criteria correctly, all 10 accounts would not receive the value 1 for 'Mark'.
Observation 5 and Observation 10 do not have eligibility date values within the window of 3 months prior to diagnosis year and 3 months after diagnosis year.
Hope this helps,
Tom Sap
https://communities.sas.com/t5/SAS-Communities-Library/Working-with-Dates-and-Times-in-SAS-Tutorial/... has a PDF with much information about dates.
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.