BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
MMariani
Fluorite | Level 6

Hello there SAS community and I hope to be clear enough about the problem I'm facing.

I have a dataset with many variables including these three : start_date, end_date and leaves_count.

The first variable tells the date when a person is absent from work(due to health reasons,  vacations etc), the second when it finishes the absence period while the third gives the absence duration in days. Here is a small sample of the data:

 start_date       end_date        leaves_count
 06AUG2018   10AUG2018     5
 13AUG2018   14AUG2018     2
 17AUG2018   17AUG2018     1

 What I would like is an output like this in which each absence is printed:

start_date      end_date          leaves_count
06AUG2018  10AUG2018       1
07AUG2018  10AUG2018       1
08AUG2018  10AUG2018       1
09AUG2018  10AUG2018       1
10AUG2018  10AUG2018       1
13AUG2018  14AUG2018       1
14AUG2018  14AUG2018       1
17AUG2018  17AUG2018       1

I have already a similar result but now I have a problem i.e. sometimes the absence period contains one or more days that should not be counted as an absence since they are already national holiday. I have the holiday list in another SAS dataset that is just one column similar like this:

holydays

01JAN2020
06JAN2020
25APR2020
01MAY2020

So, I would like to write a code able to compare from_date with the dates contained in the other dataset and skip the holidays if they are contained within the absence period and to be more clear here is an example to show what I'm trying to obtain:

Input

start_date      end_date           leaves_count

30DEC2019   03JAN2020        4

Output

start_date      end_date          leaves_count

30DEC2019   03JAN2020       1
31DEC2019   03JAN2020       1
02JAN2020    03JAN2020       1
03JAN2020    03JAN2020       1 

So, the code should "jump/skip" a cycle in this case and not printing 01JAN2020 but of course it should be valid for all other holidays within the list.

Hope you can give me some insights or suggestions on how to tackle this problem.

Thanks. 

P.s.

I'm working on Enterprise Guide version 8.2 (8.2.0.946) (32-bit).

1 ACCEPTED SOLUTION

Accepted Solutions
Shmuel
Garnet | Level 18

I suppose all date are numeric sas dates with date9. format,

otherwise convert the character type dates "ddmmmyyyy" to numeric by using:

num_date = input(chat_date, date9.);
format num_date date9.;

You can achieve wanted result by next code:

proc format lib=work;
  value holyday 
  '01jan2020'd = '1'
  '06jan2020'd = '1'
  '25apr2020'd = '1'
  '01may2020'd = '1'
  other = 0
 ;run;
 
data have;
  input @1 start_date date9. 
        @11 end_date date9.
		@20 leaves_count
    ;
  format start_date end_date date9.;
cards;
06AUG2018 10AUG2018  5
13AUG2018 14AUG2018  2
17AUG2018 17AUG2018  1
; run;
  
data want(rename=(dt=start_date));
     retain dt end_date leaves_count 1;
 set have(drop=leaves_count);
     do dt=start_date to end_date by 1;
        if input(put(dt,holyday.),1.) = 0 then output;  
     end;
     keep dt end_date leaves_count;
     format dt end_date date9.;
run;

View solution in original post

6 REPLIES 6
ballardw
Super User

I don't know of any specific adjustments for this under EG.

Good new: SAS has a function that returns if a date value is a holiday.

Not so good news, it likely does not exactly align with all of yours as there are some "holidays" that are generally not treated as days off and you may have other days than the standard.

Example:

data have;
input start_date :date9.      end_date :date9.  leaves_count;
format start_date end_date date9.;
datalines;
30DEC2019   03JAN2020        4
run;

data want;
   set have;
   do start_date = start_date to end_date;
      if holidayname(start_date) = '' then do;
         count+1;
         output;
      end;
   end;
run;

01Jan is New Years Day and is a holiday most places so works.

The Holidayname function returns the name of a holiday when the given date is one. There is an Optional parameter for LOCALE, i.e. places other than the United States, to control which is the default list of Holidays returned.

 

The test I used above of Holidayname(start_date)=' ' is basically 'not a holiday' if the correct look up is performed.

 

There is a system option EVENTDS that allows you to have your programs point to a specific list of holiday events to consider. There is a procedure DATEKEYS to help you make the sort of data set that EVENTDS wants to use. This is likely a place where EG may behave slightly differently than the Base SAS I use. The DATEKEYS procedure will allow you make events that last more than one day if needed. Having the values is about 25% of the work, getting the data set in the correct format is about half. Then testing.

mklangley
Lapis Lazuli | Level 10

Is this what you're looking for?

/* Input data */
data input;
    input start_date :date9. end_date :date9. leave_count;
    format start_date end_date date9.;
    datalines;
     06AUG2018   10AUG2018     5
     13AUG2018   14AUG2018     2
     17AUG2018   17AUG2018     1
     30DEC2019   03JAN2020     5
 ;
run;

/* Then, split up this input data with one row per day */
data input_split_by_days (drop=i);
    set input;
    do i = 1 to leave_count;
        leave_count = 1;
        output;
        start_date = intnx('day', start_date, 1, 's');
    end;
run;
 
/* Holiday data */
data holidays;
    input holiday_date :date9.;
    format holiday_date date9.;
    datalines;
    01JAN2020
    06JAN2020
    25APR2020
    01MAY2020
    ;
run;

/* Join the input data (split up by days) to the holiday data. Only keep records where the start_date is not a holiday. */
proc sql;
    create table want as
    select i.*
    from input_split_by_days i
    left join holidays h
        on i.start_date = h.holiday_date
    where h.holiday_date is null;
quit;

 

 

Shmuel
Garnet | Level 18

I suppose all date are numeric sas dates with date9. format,

otherwise convert the character type dates "ddmmmyyyy" to numeric by using:

num_date = input(chat_date, date9.);
format num_date date9.;

You can achieve wanted result by next code:

proc format lib=work;
  value holyday 
  '01jan2020'd = '1'
  '06jan2020'd = '1'
  '25apr2020'd = '1'
  '01may2020'd = '1'
  other = 0
 ;run;
 
data have;
  input @1 start_date date9. 
        @11 end_date date9.
		@20 leaves_count
    ;
  format start_date end_date date9.;
cards;
06AUG2018 10AUG2018  5
13AUG2018 14AUG2018  2
17AUG2018 17AUG2018  1
; run;
  
data want(rename=(dt=start_date));
     retain dt end_date leaves_count 1;
 set have(drop=leaves_count);
     do dt=start_date to end_date by 1;
        if input(put(dt,holyday.),1.) = 0 then output;  
     end;
     keep dt end_date leaves_count;
     format dt end_date date9.;
run;
MMariani
Fluorite | Level 6

Thanks a lot, the code almost works but unfortunately I forgot some details like the fact that sometimes the period between start_date and end_date does not equal to leaves_count (yet to find why), so I might have:
start_date     end_date       leaves_count
25JUL2018 03AUG2018      8
And I want this output:
25JUL2018  03AUG2018     1
26JUL2018  03AUG2018     1
27JUL2018  03AUG2018     1
28JUL2018  03AUG2018     1
29JUL2018  03AUG2018     1
30JUL2018   03AUG2018    1
31JUL2018   03AUG2018    1
01AUG2018  03AUG2018    1
So, the cycle should be based on leaves_count that by the way can be '0.5' (half day absence) sometimes but I saw the code already handles that. I tried to modify it but with little success and I'm kinda frustratred since it should not be that hard but I guess I have still a lot to practice.
Thanks again for the code!

Shmuel
Garnet | Level 18

I understand that leaves_count is the upper limit of counting days, then try next code:

data want(rename=(dt=start_date));
     retain dt end_date count leaves_count 1 ;
 set have(rename=(leaves_count=max_count));
     count=0;
     do dt=start_date to end_date by 1;
        if input(put(dt,holyday.),1.) = 0 then do;
          count+1; 
          if count le max_count then output;  
     end;
     keep dt end_date leaves_count;
     format dt end_date date9.;
run;

 

MMariani
Fluorite | Level 6
Sorry for the late answer but I was caught up at work.
The code eventually worked, there were just some changes needed to optimize it that my team leader showed me after being able to reach me.
So thanks a lot for the advice!

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 6 replies
  • 1045 views
  • 4 likes
  • 4 in conversation