BookmarkSubscribeRSS Feed
calger
Fluorite | Level 6

Hello, I am trying to count the number of company-specific holidays an employee has worked between two different dates.

 

There is a Holiday table which lists the company holidays, starting from 2013 and going through 2025. So, there is 12-25-2013, 12-25-2014, etc. Then there are the holidays that change date every year, such as Thanksgiving and the day after Thanksgiving, etc. I cannot use the SAS holiday function because it includes days my company does not take off.

HOLIDAY_ID HOLIDAY_DATE

1                 12/25/2013

2                 01/01/2014

3                 05/27/2014

 

There is a table with employees working different time spans. There are more columns involved, but essentially, it looks like this:

EMP_ID  WORK_START   WORK_END_DATE

1                 12/24/2013          05/26/2014

2                  12/24/2013          06/01/2014

3                 01/06/2014           03/23/2014

 

Basically, for each employee (EMP_ID) I need the number of rows in the Holiday table equal to the number of holidays between WORK_START and WORK_END DATE. Aka "Want" table looks like

EMP_ID  WORK_START  WORK_END_DATE  HOLIDAY_COUNT

1                 12/24/2013          05/26/2014                    2

2                  12/24/2013          06/01/2014                   3

3                 01/06/2014           03/23/2014                   0

 

The employee work table is pretty large.

 

I'm sure this is something that's been solved before and is probably simple, but I cannot find a solution online. Thank you!

         

7 REPLIES 7
ChrisNZ
Tourmaline | Level 20

Like this?

proc sql;

  select EMP_ID, WORK_START, WORK_END_DATE, count(HOLIDAY_DATE) as HOLIDAY_COUNT

  from EMPLOYEE left join HOLIDAYS

  on HOLIDAY_DATE between WORK_START and WORK_END_DATE

calger
Fluorite | Level 6
Won't this result in cross join that takes forever to execute? Is there a
better way?
ChrisNZ
Tourmaline | Level 20

@calger wrote:
Won't this result in cross join that takes forever to execute? Is there a
better way?

Well the criteria don't include equijoins or other proximity conditions, so a cartesian join is unavoidable in this case.

This does not mean it takes forever. What are the volumes?  

It takes 50 seconds for 1m times 800 records on my old PC.

 

data EMP;
  do ID=1 to 1e6;
    START=ranuni(1)*20000;
    END  =START+ranuni(1)*5000;
    output;
  end;
run;
data HOL;
  do DAY=1 to 25000;
    if ranuni(1)>.97 then output;
  end;
run;
proc sql _method;
  create table WANT as 
  select ID, START, END, count(DAY) 
  from EMP left join HOL 
  on DAY between START and END
  group by ID, START, END;
quit;

 

ChrisNZ
Tourmaline | Level 20

The good thing about using an array is you can stop the cartesian match before it is fully complete, when the holidays come after the end date. For example this runs faster than the proc SQL.

data _null_;
   call symputx('num_holidays', NUM_HOLIDAYS);
   if 0 then set HOL nobs=NUM_HOLIDAYS;
run;

data WANT2;
  set EMP;
  array HOLIDAYS[&num_holidays.] _temporary_;

  if _N_ = 1 then do I = 1 to NUM_HOLIDAYS;
    set HOL nobs= NUM_HOLIDAYS;
    HOLIDAYS[I] = DAY;
  end;

  do I = 1 to dim(HOLIDAYS) until (HOLIDAYS[I] > END);  
    HOLIDAY_COUNT = sum(HOLIDAY_COUNT, START <= HOLIDAYS[I] <= END );
  end;

  keep ID START END HOLIDAY_COUNT;
run;

 

andreas_lds
Jade | Level 19

Another solution: using an array as lookup:

data _null_;
   call symputx('num_holidays', num_holidays);
   if 0 then set work.holidays nobs=num_holidays;
run;

data want;
   set work.emps;

   length holiday_count 8;

   array holidays[&num_holidays.] _temporary_;

   if _n_ = 1 then do;
      do i = 1 to num_holidays;
         set work.holidays nobs= num_holidays;
         holidays[i] = holiday_date;
      end;
   end;

   holiday_count = 0;

   do i = 1 to dim(holidays);
      holiday_count = holiday_count + (work_start <= holidays[i] <= work_end);
   end;

   drop i holiday_date holiday_id;
run;
Ksharp
Super User
data holiday;
input HOLIDAY_ID HOLIDAY_DATE : mmddyy12.;
format HOLIDAY_DATE  mmddyy10. ;
cards;
1                 12/25/2013
2                 01/01/2014
3                 05/27/2014
;

data have;
input EMP_ID  WORK_START  : mmddyy12. WORK_END_DATE : mmddyy12.;
format WORK_START WORK_END_DATE mmddyy10. ;
cards;
1                 12/24/2013          05/26/2014
2                  12/24/2013          06/01/2014
3                 01/06/2014           03/23/2014
;

data want;
 if _n_=1 then do;
  if 0 then set holiday(keep=HOLIDAY_DATE);
  declare hash h(dataset:'holiday(keep=HOLIDAY_DATE)');
  h.definekey('HOLIDAY_DATE');
  h.definedone();
 end;
set have;
HOLIDAY_COUNT=0;
do i=WORK_START to WORK_END_DATE;
 if h.check(key:i)=0 then HOLIDAY_COUNT+1;
end;
drop i  HOLIDAY_DATE;
run;
andreas_lds
Jade | Level 19

And another solution: A data-null-step is used to create a data step that does all the work.

 

data work.holidays;
   input holiday_id holiday_date mmddyy10.;
   datalines;
1 12/25/2013
2 01/01/2014
3 05/27/2014
;

data work.emps;
   length emp_id 8 work_start work_end 8;
   informat work_: mmddyy10.;
   format work_: date9.;
   input emp_id work_start work_end;
   datalines;
1 12/24/2013 05/26/2014
2 12/24/2013 06/01/2014
3 01/06/2014 03/23/2014
;

data _null_;
   set work.holidays end= jobDone;

   length dummy $ 100;

   if _n_ = 1 then do;
      call execute('data work.want; set work.emps;');
      call execute('holiday_count =');
   end;

   dummy = '';

   if _n_ > 1 then do;
      dummy = '+';
   end;

   dummy = catx(' ',  dummy, '(work_start <=', holiday_date, '<= work_end)');
   call execute(dummy);

   put '  ' dummy;
   if jobDone then do;
      call execute('; run;');
   end;
run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1952 views
  • 2 likes
  • 4 in conversation