Help using Base SAS procedures

Compare two dates with a specified date range

Reply
Frequent Contributor
Posts: 107

Compare two dates with a specified date range

Hello,

I'd like compare two dates (say start date and end date for program admission and discharge, moving in and out of a city) with another two dates (usually by calendar year from Jan 1 to Dec 31, or fiscal year from April 1 of one year to March 31 of the next year). I'd like to output records that meet the criteria for each year so that I can know if they are active in a given year.

Here is an example and in this particular cases, the comparison range is April 1 to March 31. So a person's record should be outputted for each year he or she is active between April 1 of and March 31. The end date can be missing (though I could set the missing as the current date) and a person could have multiple entries and exits.

I've the following code that is doing the job, but I am hoping to simplify the code by using macro or write different code. How do I start? I learned from this forum to use the function year(intnx("YEAR.4",sdate,0)) to select fiscal year. I know there are numerous date functions but I couldn't find examples similar to what I wanted to do. Many thanks.

data test;

format id best. startdate  enddate yymmdd10.;

input id @3 startdate yymmdd10. @14 enddate yymmdd10. @25 amount;

datalines;

1 2003-04-01 2005-01-01 3

1 2007-06-01 2009-01-01 4

2 2004-01-01 2009-12-01 2

3 2003-01-01 2005-01-02 9

3 2007-01-01            4

4 2006-06-01 2009-01-01 1

5 2007-07-01 2009-01-01 0

;

run;

data  yr2002 yr2003 yr2004 yr2005 yr2006 yr2007 yr2008 yr2009 yr2010;   set test;

    if startdate <= '31Mar2003'd and (enddate >= '1Apr2002'd or enddate = .) then output yr2002;

    if startdate <= '31Mar2004'd and (enddate >= '1Apr2003'd or enddate = .) then output yr2003;

    if startdate <= '31Mar2005'd and (enddate >= '1Apr2004'd or enddate = .) then output yr2004;

    if startdate <= '31Mar2006'd and (enddate >= '1Apr2005'd or enddate = .) then output yr2005;

    if startdate <= '31Mar2006'd and (enddate >= '1Apr2005'd or enddate = .) then output yr2005;

    if startdate <= '31Mar2007'd and (enddate >= '1Apr2006'd or enddate = .) then output yr2006;

    if startdate <= '31Mar2008'd and (enddate >= '1Apr2007'd or enddate = .) then output yr2007;

run;

Results for selected years are as follows:

yr2002

id startdate    enddate amount

3 2003-01-01 2005-01-02 9

yr2003

id startdate    enddate   amount

1 2003-04-01 2005-01-01 3

2 2004-01-01 2009-12-01 2

3 2003-01-01 2005-01-02 9

yr2007

id startdate   enddate     amount

1 2007-06-01 2009-01-01 4

2 2004-01-01 2009-12-01 2

3 2007-01-01 .                4

4 2006-06-01 2009-01-01 1

5 2007-07-01 2009-01-01 0

Super User
Posts: 19,789

Re: Compare two dates with a specified date range

Here's one way.

One change I do make with this type of data is to add an enddate for those that are missing. Either the max current value or a value that's riduculous, 2099 for example depending on what I want to do.

In this case I made it today() so it would reflect current data.

Then I basically expand the data set so there's a record for each year of interest and output that to a new dataset. That dataset can then by used with by variables if necessary later on rather than have multiple datasets for each year.

data test;

format id best. startdate  enddate yymmdd10.;

input id @3 startdate yymmdd10. @14 enddate yymmdd10. @25 amount;

datalines;

1 2003-04-01 2005-01-01 3

1 2007-06-01 2009-01-01 4

2 2004-01-01 2009-12-01 2

3 2003-01-01 2005-01-02 9

3 2007-01-01            4

4 2006-06-01 2009-01-01 1

5 2007-07-01 2009-01-01 0

;

run;

data want;

    set test;

    if enddate=. then enddate=today();

    date=startdate;

    do while (date < enddate);

        *find fiscal year;

        year=year(intnx("Year.4", date, 0));

        date=intnx('Year', date, 1, 'same');

        output;

    end;

run;

proc print data=want;

where year=2002;

run;

proc print data=want;

where year=2003;

run;

proc print data=want;

where year=2007;

run;

Super User
Posts: 10,023

Re: Compare two dates with a specified date range

Your question is not really easy.

data test;
format id best. startdate  enddate yymmdd10.;
input id @3 startdate yymmdd10. @14 enddate yymmdd10. @25 amount;
datalines;
1 2003-04-01 2005-01-01 3
1 2007-06-01 2009-01-01 4
2 2004-01-01 2009-12-01 2
3 2003-01-01 2005-01-02 9
3 2007-01-01      .     4
4 2006-06-01 2009-01-01 1
5 2007-07-01 2009-01-01 0
;
run;
data want(drop=_year i);
 set test;
 if enddate=. then enddate=today();

 do i=startdate to  enddate ;
  if year(intnx('year.4',i,0)) ne _year then do;
                                   year=year(intnx('year.4',i,0));
                                   output;
                                   end;
  _year=year(intnx('year.4',i,0));
 end;
run;
proc sort data=want;by year; run;
data _null_;
if _n_ eq 1 then do;
 if 0 then set want;
 declare hash ha(multidata:'Y');
  ha.definekey('year');
  ha.definedata('id','startdate','enddate','amount');
  ha.definedone();
end;
do until(last.year);
 set want;
  by year;
  ha.add();
end;
ha.output(dataset:cats('yr',year));
ha.clear();
run;

Ksharp

Respected Advisor
Posts: 3,156

Re: Compare two dates with a specified date range

Hi,

I have withdrawn my previous post for not reading through OP's request. Here is my new try, for the dynamic approach:

/*raw data stolen from Fareeza above*/

data test;

format id best. startdate  enddate yymmdd10.;

input id @3 startdate yymmdd10. @14 enddate yymmdd10. @25 amount;

datalines;

1 2003-04-01 2005-01-01 3

1 2007-06-01 2009-01-01 4

2 2004-01-01 2009-12-01 2

3 2003-01-01 2005-01-02 9

3 2007-01-01            4

4 2006-06-01 2009-01-01 1

5 2007-07-01 2009-01-01 0

;

run;

/*to get the end year*/

proc sql;

  select max(year(intnx('year.4',enddate,0,'b'))) into :max_year from test;

  quit;

data want;

  set test;

   _s=intnx('year.4', startdate,0,'b');

   _e=ifn(missing(enddate),&max_year,year(intnx('year.4',enddate,0,'b')));

        do year=year(_s) to _e;

       output;

        end;

        drop _:;

run;

proc sort data=want;

by year;

run;

/*dynamically output data set by year without hard coding their name*/

data _null_;

  declare hash h(multidata:'y');

  h.definekey('year');

  h.definedata('id', 'startdate', 'enddate', 'amount');

  h.definedone();

  do until (last.year);

     set want;

         by year;

         rc=h.add();

  end;

h.output(dataset:'yr'||put(year,z4.));

run;

Haikuo

Ask a Question
Discussion stats
  • 3 replies
  • 865 views
  • 0 likes
  • 4 in conversation