Dear SAS experts,
I'd like find out how long it took patients to get on the insurance coverage again.
My data has date of diganosis (date_diagnosis) and date pairs where each pair measures the length of individual insurance coverage intervals.
I'm trying to create a column which takes 0 when date_diagnosis falls inside any of date pairs and number of days, in other words, distance between date_diagnosis and the beginning of the next nearest insurance coverage as shown in the image,
For example, pat_id=6 was diagnosed for a condition in 15SEP2005. However, it was not contained in both date pairs shown in the image below. My goal is to calculate the distance between date_diagnosis and elig_beg2. AKA, the time until next insurance coverage. Output variable would be a column taking 0 through continuous numbers associated with a time until patient's next insurance coverage.
DATA HAVE;
format id 8. DATE_DIAGNOSIS elig_beg1 elig_end1 elig_beg2 elig_end2 elig_beg3 elig_end3 elig_beg4 elig_end4 elig_beg5 elig_end5 elig_beg6 elig_end6 elig_beg7 elig_end7 elig_beg8 elig_end8 date9.;
informat DATE_DIAGNOSIS elig_beg1 elig_end1 elig_beg2 elig_end2 elig_beg3 elig_end3 elig_beg4 elig_end4 elig_beg5 elig_end5 elig_beg6 elig_end6 elig_beg7 elig_end7 elig_beg8 elig_end8 date9.;
input ID DATE_DIAGNOSIS elig_beg1 elig_end1 elig_beg2 elig_end2 elig_beg3 elig_end3 elig_beg4 elig_end4 elig_beg5 elig_end5 elig_beg6 elig_end6 elig_beg7 elig_end7 elig_beg8 elig_end8;
cards;
1 29MAR2005 01NOV2003 30NOV2003 01JAN2004 31JUL2004 01OCT2004 31DEC2004 01JUN2005 31JUL2005 01OCT2005 31JAN2006 01FEB2011 30JUN2011 01JAN2014 28FEB2014 01JAN2015 31JAN2017
2 08JUN2005 01JAN2002 31JAN2002 01DEC2005 28FEB2013 01DEC2016 31DEC2016 01OCT2017 30NOV2017 . . . . . . . .
3 24JUN2005 01JAN2002 31JAN2002 01AUG2002 30SEP2002 01NOV2004 31DEC2004 01JUL2011 31AUG2011 . . . . . . . .
4 18JUL2005 01AUG2005 31DEC2006 01DEC2009 31DEC2010 . . . . . . . . . . . .
5 15AUG2005 01MAY2002 31MAY2002 01MAR2003 31MAY2003 01OCT2003 30NOV2003 01JAN2004 29FEB2004 01APR2004 30APR2004 01APR2006 31MAY2006 01OCT2006 30NOV2006 . .
6 15SEP2005 01MAR2005 31MAR2005 01FEB2006 28FEB2006 . . . . . . . . . . . .
7 17OCT2005 01MAR2005 31MAY2005 01JAN2006 28FEB2006 01FEB2007 28FEB2007 . . . . . . . . . .
8 30OCT2005 01JUN2005 30JUN2005 01AUG2005 30SEP2005 01MAY2006 30JUN2006 01NOV2009 31DEC2009 01MAY2011 30NOV2011 01JAN2013 28FEB2013 01DEC2013 30NOV2014 01MAR2015 31MAY2015
9 10NOV2005 01JUN2004 30JUN2004 01SEP2004 31OCT2004 01MAY2005 30JUN2005 01SEP2006 30NOV2006 01FEB2007 30NOV2007 01NOV2008 31DEC2008 01APR2009 30APR2009 . .
10 17NOV2018 01MAY2005 31JUL2005 01APR2006 31MAY2006 01APR2009 31MAY2009 01APR2010 30APR2010 01APR2011 30JUN2011 01MAR2012 30APR2012 01MAR2013 31MAR2013 01DEC2016 31DEC
11 10DEC2004 01NOV2004 30JUN2008 01OCT2008 30JUN2011 . . . . . . . . . . . .
12 14DEC2004 01NOV2004 31OCT2008 . . . . . . . . . . . . . .
13 01DEC2008 01NOV2004 31OCT2008 01DEC2008 31DEC2016 . . . . . . . . . . . .
14 24DEC2004 01NOV2004 30JUN2005 . . . . . . . . . . . . . . . .
15 01DEC2004 01NOV2005 31JAN2017 . . . . . . . . . . . . . . . .
16 22DEC2005 01NOV2005 31DEC2016 . . . . . . . . . . . . . . . .
17 03DEC2009 01NOV2009 31DEC2016 01NOV2017 31DEC2017 . . . . . . . . . . . .
18 01DEC2010 01NOV2010 31JUL2011 01NOV2013 31OCT2014 . . . . . . . . . . . .
;
and a shorter version with only one loop:
data want;
length distance i 8;
set have;
array _begin[*] elig_beg:;
array _end[*] elig_end:;
do i = 1 to dim(_begin);
if _begin[i] <= date_diagnosis <= _end[i] then do;
distance = 0;
leave;
end;
else do;
if date_diagnosis < _begin[i] then do;
distance = _begin[i] - date_diagnosis;
leave;
end;
end;
end;
drop i;
run;
Can you please post what you expect as result for each observation?
The following step seems to create what you want:
data want;
length distance 8;
set have;
array _begin[*] elig_beg:;
array _end[*] elig_end:;
do i = 1 to dim(_begin);
if _begin[i] <= date_diagnosis <= _end[i] then do;
distance = 0;
leave;
end;
end;
if missing(distance) then do;
if date_diagnosis < elig_beg1 then do;
distance = date_diagnosis - elig_beg1;
end;
else do;
do i = 1 to dim(_begin) -1;
if _end[i] < date_diagnosis < _begin[i+1] then do;
distance = _begin[i+1] - date_diagnosis;
leave;
end;
end;
end;
end;
drop i;
run;
Next step: optimize the code, so to avoid the second loop.
and a shorter version with only one loop:
data want;
length distance i 8;
set have;
array _begin[*] elig_beg:;
array _end[*] elig_end:;
do i = 1 to dim(_begin);
if _begin[i] <= date_diagnosis <= _end[i] then do;
distance = 0;
leave;
end;
else do;
if date_diagnosis < _begin[i] then do;
distance = _begin[i] - date_diagnosis;
leave;
end;
end;
end;
drop i;
run;
Would you please help me create a dummy variable (0,1) whether patient had ever/never enrolled in an insurance? (date_diagnosis)=year(date pairs) .
Maybe I should create a new post?
Thanks in advance.
@Cruise wrote:
Would you please help me create a dummy variable (0,1) whether patient had ever/never enrolled in an insurance? (date_diagnosis)=year(date pairs) .
Maybe I should create a new post?
Thanks in advance.
Please re-post this as new topic. So that the community notices it.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.