Help using Base SAS procedures

Matching date in date-range

Reply
Occasional Contributor
Posts: 5

Matching date in date-range

Dear fellow SAS-users,

This is very likely to be a dreadfully unintelligent question on my part with a (hopefully) trivial answer, but please bear with me as I am mainly used to other programming languages and failry new to SAS. Altough the actual problem is somewhat more elaborate, I can simplify and reduce it to the following: I have table t1 containing a very large list of individuals and the dates they were on holiday in the format [ID] [HOLIDAY_STARTDATE] [HOLIDAY_ENDDATE]. In table t2, I have a list of every date in a given year with a column of 0's, as in [DATE_DAY][ONHOLIDAY]. Now what I would like to do is 1) check for each individual in t1 and each dat in t2 whether or not the particular date is within the range of an individual's holiday and if so 2) set the corresponding [ONHOLIDAY] value to 1. The end product should look something like this:

[ID]      [DATE]          [ONHOLIDAY]

1          01/01/2014     0

1          02/01/2014     0

1          03/01/2014     1

1          04/01/2014     1

1          05/01/2014     0

...

ID     31/12/2014     0

Effectively creating 365/366 entries for each individual having a 1 for each day they were on holiday.

Now I did try the following (although quite amateuristic) in PROC SQL, given the "zero-vector" per day (t1) and individual holiday dates (t2):

proc sql;

     create table t3 as

           select distinct ID, DATE,

                case

                     when DATE between HOLIDAY_STARTDATE and HOLIDAY_ENDDATE then 1

                     else 0

                end

           as ONHOLIDAY

                from t1, t2

                     group by DATE having (ONHOLIDAY = max(ONHOLIDAY))

                           order by DATE, ONHOLIDAY;

quit;

This works fine for small numbers of individuals, however when attempting to run this against the large database I have, I get error 5430 (http://support.sas.com/kb/5/430.html) , essentially telling me my query was too complex (note that I cannot change the cfg-files on my company laptop so I am stuck with this error unless I can optimize the query, also compression did not help).

Now, I would like to know whether there is an alternative to this. I am used to fairly easily writing loops and do work on matrices for this kind of simple problem (i.e. MATLAB), but SAS necessitates a different way of thinking, hence my question. Any suggestions are more than welcome!

Thanks a lot in advance!

Super User
Posts: 9,671

Re: Matching date in date-range

How big are these two tables ?

data t1;
input id start : ddmmyy10. end : ddmmyy10.;
format start  end  ddmmyy10.;
cards;
1 03/01/2014 04/01/2014
;
run;
data t2;
input id DATE : ddmmyy10.;
format  date ddmmyy10.;
cards;
1          01/01/2014     
1          02/01/2014     
1          03/01/2014    
1          04/01/2014    
1          05/01/2014   
;
run;

data key;
 set t1;
 do date=start to end;
  output;
 end;
 drop start end;
run;
data want;
 merge t2 key(in=inkey);
 by id date;
 ONHOLIDAY=ifn(inkey,1,0);
run;



Xia Keshan

Occasional Contributor
Posts: 5

Re: Matching date in date-range

Hi,

I would say about 500-1000Mb per year of data for the holiday statistics (the day vector itself is likely to be very small, their "product" by contrast may become much larger).

Thanks for the reply!

-Alexander

Super User
Posts: 6,928

Re: Matching date in date-range

How about this:

data have;

infile cards;

input id startdate enddate;

informat startdate enddate ddmmyy10.;

format startdate enddate ddmmyy10.;

cards;

1 10/01/2014 12/01/2014

1 15/06/2014 20/06/2014

2 01/07/2014 31/07/2014

;

run;

proc sort data=have;

by id startdate;

run;

data want (keep=id date onholiday);

set have;

by id;

retain date;

format date ddmmyy10.;

if first.id then date = '01jan2014'd;

onholiday = 0;

do until (date >= startdate);

  output;

  date = date + 1;

end;

onholiday = 1;

do date = startdate to enddate;

  output;

end;

if last.id

then do;

  onholiday = 0;

  do until (date >= '31dec2014'd + 1);

    output;

    date = date + 1;

  end;

end;

run;

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Ask a Question
Discussion stats
  • 3 replies
  • 627 views
  • 0 likes
  • 3 in conversation