BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Cruise
Ammonite | Level 13

Dear SAS experts and enthusiasts,

 

My objective is to select the patients who had complete health insurance for the same calendar year of the diagnosis of a certain disease (diabetes).

 

I have a data sorted by unique patient ID (single patient per row) and time intervals to indicate the time periods when patients had valid insurance coverage.

 

For example, patient Maggie was diagnosed for a diabetes in 2005. However, he/she will be eligible if the patient had a health insurance covered from Jan 1, 2005 through Dec 31, 2005.   

 

Let me use the cases from the mock data provided below,

 

Example 1: Patient with ID=5 is NOT ELIGIBLE. The patient had several gaps in his health insurance coverage with 5 time intervals: start1-end1, start2-end2, start3-end3, start4-end5 and start5-end5.  Data shows that he was diagnosed with diabetes in 2006. However, his insurance didn’t cover calendar year of 2006 completely. He had his first insurance from 1-Feb-04 thru 28-Feb-06 and had no insurance until a-Apr-07. The year of 2006 was only partially covered. Therefore, this observation needs to be eliminated.

 

Example 2: Exception applies where insurance ended due to death. For example, ID=22 is ELIGIBLE because, apparent reason of insurance end (30-Sep-05) appear to be a death (20-Sep-05). Insurance coverage goes by full month which explains why insurance ended after death happened.  

 

Example 3: ID=23 is NOT ELIGIBLE, because patient had an insurance from 1-Jan-05 thru 1-Jun-05. However, there was one-month gap in his/her insurance coverage in July 2005.

The column KEEP indicates the final decision for the selection of each patient.

I’ll greatly appreciate if you share you insights how to SAS program to solve this eligibility problem.

 

Thanks millions in advance.

 

Give me a shout if you are unsure of anything.

 

DATA ELIGIBILITY; 
INPUT ID start1 end1 start2 end2 start3 end3 start4 end4 start5 end5 DATE_DIAGNOSIS DATE_DEATH YEAR KEEP $;
Informat start1 end1 start2 end2 start3 end3 start4 end4 start5 end5 DATE_DIAGNOSIS DATE_DEATH anydtdte.;
format start1 end1 start2 end2 start3 end3 start4 end4 start5 end5 DATE_DIAGNOSIS DATE_DEATH date9.;
CARDS;
1	1-Jan-02	30-Apr-14	1-Jun-14	31-Jul-15	.	.	.	.	.	.	1-Dec-10	25-Jul-15	2010	YES
2	1-Oct-05	31-Dec-16	.	.	.	.	.	.	.	.	14-Jan-08	.	2008	YES
3	1-Dec-11	28-Feb-14	.	.	.	.	.	.	.	.	24-Oct-13	22-Feb-14	2013	YES
4	1-Jan-02	31-Dec-16	.	.	.	.	.	.	.	.	15-Nov-06	.	2006	YES
5	1-Feb-04	28-Feb-06	1-Apr-07	31-May-10	1-Nov-10	31-Aug-13	1-Oct-14	30-Sep-15	1-Mar-16	31-Dec-16	4-Jan-06	.	2006 NO	
6	1-Jan-02	31-Aug-15	1-Dec-15	31-Dec-16	.	.	.	.	.	.	28-Apr-11	.	2011	YES
7	1-Mar-04	28-Feb-05	1-Apr-06	31-Mar-08	1-Jan-13	30-Jun-16	.	.	.	.	18-Aug-14	.	2014	YES
8	1-Jan-02	31-Oct-08	1-Dec-08	31-Jan-09	1-May-09	31-Oct-13	.	.	.	.	7-Mar-13	23-Oct-13	2013	YES
9	1-Jan-02	31-Dec-16	.	.	.	.	.	.	.	.	27-Feb-09	.	2009	YES
10	1-Jan-02	31-Dec-11	.	.	.	.	.	.	.	.	10-Sep-10	17-Dec-11	2010	YES
11	1-Sep-12	30-Jun-15	.	.	.	.	.	.	.	.	7-Jun-13	.	2013	YES
12	1-Jan-02	30-Sep-08	.	.	.	.	.	.	.	.	11-Nov-05	18-Sep-08	2005	YES
13	1-Jan-02	31-May-09	.	.	.	.	.	.	.	.	7-Jun-07	8-Dec-08	2007	YES
14	1-Jan-02	31-May-10	.	.	.	.	.	.	.	.	10-Oct-06	10-May-10	2006	YES
15	1-Jan-02	31-Dec-16	.	.	.	.	.	.	.	.	21-Jan-14	.	2014	YES
16	1-Jan-02	31-Dec-04	1-Feb-09	30-Apr-10	1-Jul-10	31-Oct-10	1-Nov-12	31-Dec-16	.	.	23-Feb-09	.	2009	NO
17	1-May-05	31-Jul-05	.	.	.	.	.	.	.	.	1-Mar-05	16-Jul-05	2005	NO
18	1-Sep-05	30-Sep-05	.	.	.	.	.	.	.	.	29-Jun-05	22-Sep-05	2005	NO
19	1-Dec-04	30-Sep-06	.	.	.	.	.	.	.	.	2-Jan-05	16-Sep-06	2005	YES
20	1-Apr-05	30-Apr-07	.	.	.	.	.	.	.	.	3-Jan-05	2-Jan-11	2005	NO
21	1-Feb-05	31-Oct-05	.	.	.	.	.	.	.	.	3-Jan-05	.	2005	NO
22	1-Jan-05	30-Sep-05	.	.	.	.	.	.	.	.	3-Jan-05	20-Sep-05	2005	YES
23	1-Apr-04	31-Jul-04	1-Jan-05	30-Jun-05	1-Aug-05	31-Dec-06	.	.	.	.	4-Jan-05	10-Apr-06	2005	NO
24	1-Sep-05	30-Nov-05	.	.	.	.	.	.	.	.	5-Jan-05	7-Oct-05	2005	NO
25	1-Jul-05	31-Dec-05	.	.	.	.	.	.	.	.	7-Jan-05	15-Jun-07	2005	NO
26	1-Apr-02	30-Sep-06	1-Dec-14	29-Feb-16	.	.	.	.	.	.	7-Jan-05	17-Oct-15	2005	YES
27	1-Apr-03	31-Aug-07	.	.	.	.	.	.	.	.	10-Jan-05	25-Aug-07	2005	YES
28	1-Jan-05	31-Mar-11	.	.	.	.	.	.	.	.	10-Jan-05	5-Feb-11	2005	YES
29	1-Jan-04	28-Feb-07	1-May-07	28-Feb-13	.	.	.	.	.	.	10-Jan-05	.	2005	YES
30	1-Dec-04	28-Feb-07	.	.	.	.	.	.	.	.	11-Jan-05	9-Feb-07	2005	YES
31	1-Jan-02	31-May-02	1-Jul-02	28-Feb-03	1-May-03	31-Dec-03	1-Mar-04	30-Nov-04	1-Jan-05	31-Mar-06	12-Jan-05	14-Jan-06	2005	YES
32	1-Jan-02	31-Oct-06	1-May-08	31-Jul-10	1-Nov-10	31-Dec-16	.	.	.	.	12-Jan-05	.	2005	YES
33	1-Jan-02	31-Oct-08	.	.	.	.	.	.	.	.	12-Jan-05	9-Oct-08	2005	YES
34	1-Jan-02	31-Dec-16	.	.	.	.	.	.	.	.	13-Jan-05	.	2005	YES
35	1-Jan-05	30-Jun-12	.	.	.	.	.	.	.	.	1-Feb-05	28-Jun-12	2005	YES
36	1-Jun-04	30-Jun-05	.	.	.	.	.	.	.	.	13-Jan-05	26-Jan-06	2005	NO
37	1-Nov-02	31-Jan-07	1-Mar-07	31-Dec-16	.	.	.	.	.	.	13-Jan-05	.	2005	YES
38	1-Jun-02	31-Dec-16	.	.	.	.	.	.	.	.	15-Jan-05	.	2005	YES
39	1-Jan-02	31-Aug-06	.	.	.	.	.	.	.	.	15-Jan-05	24-Jul-06	2005	YES
40	1-Jun-05	31-May-06	1-Dec-13	31-Jan-16	.	.	.	.	.	.	15-Jan-05	.	2005	NO
;





 

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

It seems eligible ID's (i.e. keep="YES") if they are insured

   from Jan 1 of the diagnosis year

   through the earlier of

      Dec 31 of the diagnosis year,

      or Date of death:

 

So initialize KEEP2='YES', then

  1. If the earliest START (i.e. start1) is too late (after Jan 1 or of diagnosis year) then keep2='NO', else
  2. If the latest END is too early (previous to death of or dec 31 of diag year) then keep2='NO', else
  3. Else if any gap (between END1 and START2 or between END2 and START3, not including the end points) has a date in the diag year and preceding death then KEEP2='NO'. 

   

DATA ELIGIBILITY; 
INPUT ID start1 end1 start2 end2 start3 end3 start4 end4 start5 end5 DATE_DIAGNOSIS DATE_DEATH YEAR KEEP $;
Informat start1 end1 start2 end2 start3 end3 start4 end4 start5 end5 DATE_DIAGNOSIS DATE_DEATH anydtdte.;
format start1 end1 start2 end2 start3 end3 start4 end4 start5 end5 DATE_DIAGNOSIS DATE_DEATH date9.;
CARDS;
1	1-Jan-02	30-Apr-14	1-Jun-14	31-Jul-15	.	.	.	.	.	.	1-Dec-10	25-Jul-15	2010	YES
2	1-Oct-05	31-Dec-16	.	.	.	.	.	.	.	.	14-Jan-08	.	2008	YES
3	1-Dec-11	28-Feb-14	.	.	.	.	.	.	.	.	24-Oct-13	22-Feb-14	2013	YES
4	1-Jan-02	31-Dec-16	.	.	.	.	.	.	.	.	15-Nov-06	.	2006	YES
5	1-Feb-04	28-Feb-06	1-Apr-07	31-May-10	1-Nov-10	31-Aug-13	1-Oct-14	30-Sep-15	1-Mar-16	31-Dec-16	4-Jan-06	.	2006 NO	
6	1-Jan-02	31-Aug-15	1-Dec-15	31-Dec-16	.	.	.	.	.	.	28-Apr-11	.	2011	YES
7	1-Mar-04	28-Feb-05	1-Apr-06	31-Mar-08	1-Jan-13	30-Jun-16	.	.	.	.	18-Aug-14	.	2014	YES
8	1-Jan-02	31-Oct-08	1-Dec-08	31-Jan-09	1-May-09	31-Oct-13	.	.	.	.	7-Mar-13	23-Oct-13	2013	YES
9	1-Jan-02	31-Dec-16	.	.	.	.	.	.	.	.	27-Feb-09	.	2009	YES
10	1-Jan-02	31-Dec-11	.	.	.	.	.	.	.	.	10-Sep-10	17-Dec-11	2010	YES
11	1-Sep-12	30-Jun-15	.	.	.	.	.	.	.	.	7-Jun-13	.	2013	YES
12	1-Jan-02	30-Sep-08	.	.	.	.	.	.	.	.	11-Nov-05	18-Sep-08	2005	YES
13	1-Jan-02	31-May-09	.	.	.	.	.	.	.	.	7-Jun-07	8-Dec-08	2007	YES
14	1-Jan-02	31-May-10	.	.	.	.	.	.	.	.	10-Oct-06	10-May-10	2006	YES
15	1-Jan-02	31-Dec-16	.	.	.	.	.	.	.	.	21-Jan-14	.	2014	YES
16	1-Jan-02	31-Dec-04	1-Feb-09	30-Apr-10	1-Jul-10	31-Oct-10	1-Nov-12	31-Dec-16	.	.	23-Feb-09	.	2009	NO
17	1-May-05	31-Jul-05	.	.	.	.	.	.	.	.	1-Mar-05	16-Jul-05	2005	NO
18	1-Sep-05	30-Sep-05	.	.	.	.	.	.	.	.	29-Jun-05	22-Sep-05	2005	NO
19	1-Dec-04	30-Sep-06	.	.	.	.	.	.	.	.	2-Jan-05	16-Sep-06	2005	YES
20	1-Apr-05	30-Apr-07	.	.	.	.	.	.	.	.	3-Jan-05	2-Jan-11	2005	NO
21	1-Feb-05	31-Oct-05	.	.	.	.	.	.	.	.	3-Jan-05	.	2005	NO
22	1-Jan-05	30-Sep-05	.	.	.	.	.	.	.	.	3-Jan-05	20-Sep-05	2005	YES
23	1-Apr-04	31-Jul-04	1-Jan-05	30-Jun-05	1-Aug-05	31-Dec-06	.	.	.	.	4-Jan-05	10-Apr-06	2005	NO
24	1-Sep-05	30-Nov-05	.	.	.	.	.	.	.	.	5-Jan-05	7-Oct-05	2005	NO
25	1-Jul-05	31-Dec-05	.	.	.	.	.	.	.	.	7-Jan-05	15-Jun-07	2005	NO
26	1-Apr-02	30-Sep-06	1-Dec-14	29-Feb-16	.	.	.	.	.	.	7-Jan-05	17-Oct-15	2005	YES
27	1-Apr-03	31-Aug-07	.	.	.	.	.	.	.	.	10-Jan-05	25-Aug-07	2005	YES
28	1-Jan-05	31-Mar-11	.	.	.	.	.	.	.	.	10-Jan-05	5-Feb-11	2005	YES
29	1-Jan-04	28-Feb-07	1-May-07	28-Feb-13	.	.	.	.	.	.	10-Jan-05	.	2005	YES
30	1-Dec-04	28-Feb-07	.	.	.	.	.	.	.	.	11-Jan-05	9-Feb-07	2005	YES
31	1-Jan-02	31-May-02	1-Jul-02	28-Feb-03	1-May-03	31-Dec-03	1-Mar-04	30-Nov-04	1-Jan-05	31-Mar-06	12-Jan-05	14-Jan-06	2005	YES
32	1-Jan-02	31-Oct-06	1-May-08	31-Jul-10	1-Nov-10	31-Dec-16	.	.	.	.	12-Jan-05	.	2005	YES
33	1-Jan-02	31-Oct-08	.	.	.	.	.	.	.	.	12-Jan-05	9-Oct-08	2005	YES
34	1-Jan-02	31-Dec-16	.	.	.	.	.	.	.	.	13-Jan-05	.	2005	YES
35	1-Jan-05	30-Jun-12	.	.	.	.	.	.	.	.	1-Feb-05	28-Jun-12	2005	YES
36	1-Jun-04	30-Jun-05	.	.	.	.	.	.	.	.	13-Jan-05	26-Jan-06	2005	NO
37	1-Nov-02	31-Jan-07	1-Mar-07	31-Dec-16	.	.	.	.	.	.	13-Jan-05	.	2005	YES
38	1-Jun-02	31-Dec-16	.	.	.	.	.	.	.	.	15-Jan-05	.	2005	YES
39	1-Jan-02	31-Aug-06	.	.	.	.	.	.	.	.	15-Jan-05	24-Jul-06	2005	YES
40	1-Jun-05	31-May-06	1-Dec-13	31-Jan-16	.	.	.	.	.	.	15-Jan-05	.	2005	NO
;


data want (drop=_:);
  set eligibility;
  array st {*} start: ;
  array en {*} end:   ;

  _np=n(of st{*});                              /* N of start/end pairs */
  _upper_date=min(mdy(12,31,year),date_death);  /* Ins must cover through this date */

  keep2='YES';                                  /* Assume eligiblity */
  if st{1} > mdy(1,1,year) then keep2='NO ';       /*minimum start is too late ...*/
  else if en{_np} < _upper_date then keep2='NO ';  /*max end precedes upper date*/

  /* Check every gap (i.e. end1:start2,  end2:start3) preceding upper date */
  else if _np>1 then do _i=1 to _np-1 while (en{_i}<_upper_date and keep2='YES'); 
    if year(en{_i}+1)=year then keep2='NO';
    else if year(st{_i+1}-1)=year then keep2='NO';
  end;

run;

 

Assumptions:

  1. Date pairs are in ascending order
  2. Date pairs are not contiguous (i.e. there is always a gap of at least one day between end of one pair and start of next pair.
    If this is not true, then add 

       if st{_i+1}-en{_i}=1 then continue;

    as the first statement in the do loop
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

7 REPLIES 7
mkeintz
PROC Star

The first modification to apply is to make your program reading in the date fields as sas date values instead of character variables. So:

  1. Drop the data variables from the length statement, so they are not stored as character variables.
  2. Take a look at the ANYDTDTE informat in the input statement to read the data into sas date variables.
  3. Use the FORMAT statement to tell sas how to display the date variables

 

Then we can more effectively help you with the programming task.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Cruise
Ammonite | Level 13
corrected my date accordingly. thanks!!!
mkeintz
PROC Star

It seems eligible ID's (i.e. keep="YES") if they are insured

   from Jan 1 of the diagnosis year

   through the earlier of

      Dec 31 of the diagnosis year,

      or Date of death:

 

So initialize KEEP2='YES', then

  1. If the earliest START (i.e. start1) is too late (after Jan 1 or of diagnosis year) then keep2='NO', else
  2. If the latest END is too early (previous to death of or dec 31 of diag year) then keep2='NO', else
  3. Else if any gap (between END1 and START2 or between END2 and START3, not including the end points) has a date in the diag year and preceding death then KEEP2='NO'. 

   

DATA ELIGIBILITY; 
INPUT ID start1 end1 start2 end2 start3 end3 start4 end4 start5 end5 DATE_DIAGNOSIS DATE_DEATH YEAR KEEP $;
Informat start1 end1 start2 end2 start3 end3 start4 end4 start5 end5 DATE_DIAGNOSIS DATE_DEATH anydtdte.;
format start1 end1 start2 end2 start3 end3 start4 end4 start5 end5 DATE_DIAGNOSIS DATE_DEATH date9.;
CARDS;
1	1-Jan-02	30-Apr-14	1-Jun-14	31-Jul-15	.	.	.	.	.	.	1-Dec-10	25-Jul-15	2010	YES
2	1-Oct-05	31-Dec-16	.	.	.	.	.	.	.	.	14-Jan-08	.	2008	YES
3	1-Dec-11	28-Feb-14	.	.	.	.	.	.	.	.	24-Oct-13	22-Feb-14	2013	YES
4	1-Jan-02	31-Dec-16	.	.	.	.	.	.	.	.	15-Nov-06	.	2006	YES
5	1-Feb-04	28-Feb-06	1-Apr-07	31-May-10	1-Nov-10	31-Aug-13	1-Oct-14	30-Sep-15	1-Mar-16	31-Dec-16	4-Jan-06	.	2006 NO	
6	1-Jan-02	31-Aug-15	1-Dec-15	31-Dec-16	.	.	.	.	.	.	28-Apr-11	.	2011	YES
7	1-Mar-04	28-Feb-05	1-Apr-06	31-Mar-08	1-Jan-13	30-Jun-16	.	.	.	.	18-Aug-14	.	2014	YES
8	1-Jan-02	31-Oct-08	1-Dec-08	31-Jan-09	1-May-09	31-Oct-13	.	.	.	.	7-Mar-13	23-Oct-13	2013	YES
9	1-Jan-02	31-Dec-16	.	.	.	.	.	.	.	.	27-Feb-09	.	2009	YES
10	1-Jan-02	31-Dec-11	.	.	.	.	.	.	.	.	10-Sep-10	17-Dec-11	2010	YES
11	1-Sep-12	30-Jun-15	.	.	.	.	.	.	.	.	7-Jun-13	.	2013	YES
12	1-Jan-02	30-Sep-08	.	.	.	.	.	.	.	.	11-Nov-05	18-Sep-08	2005	YES
13	1-Jan-02	31-May-09	.	.	.	.	.	.	.	.	7-Jun-07	8-Dec-08	2007	YES
14	1-Jan-02	31-May-10	.	.	.	.	.	.	.	.	10-Oct-06	10-May-10	2006	YES
15	1-Jan-02	31-Dec-16	.	.	.	.	.	.	.	.	21-Jan-14	.	2014	YES
16	1-Jan-02	31-Dec-04	1-Feb-09	30-Apr-10	1-Jul-10	31-Oct-10	1-Nov-12	31-Dec-16	.	.	23-Feb-09	.	2009	NO
17	1-May-05	31-Jul-05	.	.	.	.	.	.	.	.	1-Mar-05	16-Jul-05	2005	NO
18	1-Sep-05	30-Sep-05	.	.	.	.	.	.	.	.	29-Jun-05	22-Sep-05	2005	NO
19	1-Dec-04	30-Sep-06	.	.	.	.	.	.	.	.	2-Jan-05	16-Sep-06	2005	YES
20	1-Apr-05	30-Apr-07	.	.	.	.	.	.	.	.	3-Jan-05	2-Jan-11	2005	NO
21	1-Feb-05	31-Oct-05	.	.	.	.	.	.	.	.	3-Jan-05	.	2005	NO
22	1-Jan-05	30-Sep-05	.	.	.	.	.	.	.	.	3-Jan-05	20-Sep-05	2005	YES
23	1-Apr-04	31-Jul-04	1-Jan-05	30-Jun-05	1-Aug-05	31-Dec-06	.	.	.	.	4-Jan-05	10-Apr-06	2005	NO
24	1-Sep-05	30-Nov-05	.	.	.	.	.	.	.	.	5-Jan-05	7-Oct-05	2005	NO
25	1-Jul-05	31-Dec-05	.	.	.	.	.	.	.	.	7-Jan-05	15-Jun-07	2005	NO
26	1-Apr-02	30-Sep-06	1-Dec-14	29-Feb-16	.	.	.	.	.	.	7-Jan-05	17-Oct-15	2005	YES
27	1-Apr-03	31-Aug-07	.	.	.	.	.	.	.	.	10-Jan-05	25-Aug-07	2005	YES
28	1-Jan-05	31-Mar-11	.	.	.	.	.	.	.	.	10-Jan-05	5-Feb-11	2005	YES
29	1-Jan-04	28-Feb-07	1-May-07	28-Feb-13	.	.	.	.	.	.	10-Jan-05	.	2005	YES
30	1-Dec-04	28-Feb-07	.	.	.	.	.	.	.	.	11-Jan-05	9-Feb-07	2005	YES
31	1-Jan-02	31-May-02	1-Jul-02	28-Feb-03	1-May-03	31-Dec-03	1-Mar-04	30-Nov-04	1-Jan-05	31-Mar-06	12-Jan-05	14-Jan-06	2005	YES
32	1-Jan-02	31-Oct-06	1-May-08	31-Jul-10	1-Nov-10	31-Dec-16	.	.	.	.	12-Jan-05	.	2005	YES
33	1-Jan-02	31-Oct-08	.	.	.	.	.	.	.	.	12-Jan-05	9-Oct-08	2005	YES
34	1-Jan-02	31-Dec-16	.	.	.	.	.	.	.	.	13-Jan-05	.	2005	YES
35	1-Jan-05	30-Jun-12	.	.	.	.	.	.	.	.	1-Feb-05	28-Jun-12	2005	YES
36	1-Jun-04	30-Jun-05	.	.	.	.	.	.	.	.	13-Jan-05	26-Jan-06	2005	NO
37	1-Nov-02	31-Jan-07	1-Mar-07	31-Dec-16	.	.	.	.	.	.	13-Jan-05	.	2005	YES
38	1-Jun-02	31-Dec-16	.	.	.	.	.	.	.	.	15-Jan-05	.	2005	YES
39	1-Jan-02	31-Aug-06	.	.	.	.	.	.	.	.	15-Jan-05	24-Jul-06	2005	YES
40	1-Jun-05	31-May-06	1-Dec-13	31-Jan-16	.	.	.	.	.	.	15-Jan-05	.	2005	NO
;


data want (drop=_:);
  set eligibility;
  array st {*} start: ;
  array en {*} end:   ;

  _np=n(of st{*});                              /* N of start/end pairs */
  _upper_date=min(mdy(12,31,year),date_death);  /* Ins must cover through this date */

  keep2='YES';                                  /* Assume eligiblity */
  if st{1} > mdy(1,1,year) then keep2='NO ';       /*minimum start is too late ...*/
  else if en{_np} < _upper_date then keep2='NO ';  /*max end precedes upper date*/

  /* Check every gap (i.e. end1:start2,  end2:start3) preceding upper date */
  else if _np>1 then do _i=1 to _np-1 while (en{_i}<_upper_date and keep2='YES'); 
    if year(en{_i}+1)=year then keep2='NO';
    else if year(st{_i+1}-1)=year then keep2='NO';
  end;

run;

 

Assumptions:

  1. Date pairs are in ascending order
  2. Date pairs are not contiguous (i.e. there is always a gap of at least one day between end of one pair and start of next pair.
    If this is not true, then add 

       if st{_i+1}-en{_i}=1 then continue;

    as the first statement in the do loop
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Cruise
Ammonite | Level 13
amazing work!!!!!! thank you so much Mkeintz!
Cruise
Ammonite | Level 13

@mkeintz

 

When I ran the code on my actual data (N=1.2 million and 35 date pairs) I get following error and moved END to different locations with no success. Any idea? 

 

ERROR: Array subscript out of range at line 367 column 13.

 

mkeintz
PROC Star
  1. I have no idea what statement is on line 367.  You would have to show the log.

  2. N=1.2 million is not the issue.  There is something about some particular observation.  Didn't  your log show the values of all the variables for the offending observation?  Examine those and see if there is something wrong.  Perhaps the START or END variables.
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Cruise
Ammonite | Level 13

Sorry, my bad! I ran the code before I deduplicate the data by patient ID. start and end variables were repeated in the duplicate data. The program ran with no problem on the unique row per patient data.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1120 views
  • 3 likes
  • 2 in conversation