BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Jordani
Obsidian | Level 7

Hi Everyone,

 

I am trying to calculate the number of absence days (business days only)  for employees. I have multiple records for each employee and the records can be duplicate or have overlapping dates. I am using Alice. M Cheng method for calculating intervals and accounting for overlaps. The missing part is how to incorporate business days in the code.  I tried to incorporate the intck function but it didn't work .

 

 

The data looks as follows

 

 

ID         startdate        enddate

1         1-1-2015          1-17-2015

1          1-1-2015         1-25-2015

1          2-7-2015         2-25-2015

2          3-4-2016         5-8-2016

2          3-4-2016        5-12-2016

2          3-4-2016        5-12- 2016 

 

 

The code used to calculate intervals

 

 

proc sort data=Dates1;

by ID startdate ;

run;

proc transpose data=Dates1 out=STARTDT (drop=_name_) prefix=START;

by ID;

var startdate;

run;

proc transpose data=Dates1 out=STOPDT (drop=_name_) prefix=STOP;

by ID;

var enddate;

run;

data AE_DURH;

merge STARTDT STOPDT;

by ENROLID;

run;

 

 

data METHOD1;

retain TOTDUR;

set AE_DURH; *--- Use Horizontal Data Representation. ---*;

array START{*} START:;

array STOP{*} STOP:;

TOTDUR=STOP1-START1+1

;

do i=2 to dim(START);

do j=1 to i-1;

 

*--- Make overlapping but not embedded intervals disjoint. ---*;

if . < START(i) <= STOP(j) and STOP(i) > STOP(j) > . then

do;

START(i)=STOP(j)+1;

end;

 

*--- Embedded in previous interval. Will not contribute to duration. ---*;

else

do;

if . < START(i) <= STOP(j) and . < STOP(i) <= STOP(j) then

do;

START(i)=.;

STOP(i)=.;

end;

end;

end;

if START(i) ne . and STOP(i) ne . then DUR=STOP(i)-START(i)+1;

else DUR=0;

 

TOTDUR+DUR;

end;

label TOTDUR='DURATION |(DAYS)';

run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

You can also do this with arrays

 

data have;
input ID (startdate enddate) (:mmddyy.);
format startdate enddate yymmdd10.;
datalines;
1         1-1-2015          1-17-2015
1          1-1-2015         1-25-2015
1          2-7-2015         2-25-2015
2          3-4-2016         5-8-2016
2          3-4-2016        5-12-2016
2          3-4-2016        5-12-2016
;

proc sql;
select min(startdate), max(enddate) 
into :from, :to
from have;
quit;

data want;
array d {&from : &to} _temporary_;
call missing(of d{*});
do until(last.id);
    set have; by id;
    do i = startdate to enddate;
        if 
            weekday(i) not in (1, 7) and         /* weekends */
            not (month(i)=12 and day(i)=25) and  /* Christmas */
            not (month(i)=1 and day(i)=1)        /* New Year */
                                                 /* ... etc. */
        then d{i} = 1;
        end;
    end;
duration = sum(of d{*});
keep ID duration;
run;

just complete the list of holidays for your locale.

 

Edit: The HOLIDAY function can help you define those if you are in USA or Canada.

PG

View solution in original post

7 REPLIES 7
ballardw
Super User

Didn't work is awful vague.

Are there errors in the log?: Post the code and log in a code box opened with the {i} to maintain formatting of error messages.

No output? Post any log in a code box.

Unexpected output? Provide input data in the form of data step code pasted into a code box, the actual results and the expected results. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.

Jordani
Obsidian | Level 7

Hi ballardw,

 

Thank you for your response. I have zero values for duration DUR and total duration TOTDUR. here is the log, the output and the function that I have used. The code seems to work in disjointing the overlapping periods but the formula to calculate the dates is not working.

 

P.S.  The link in your comment is not working.

data METHOD1;
retain TOTDUR;
set AE_DURH; *--- Use Horizontal Data Representation. ---*;
array START{*} START:;
array STOP{*} STOP:;
TOTDUR=intck('dtday', startdate, enddate)+1;
do i=2 to dim(START);
do j=1 to i-1;
*--- Make overlapping but not embedded intervals disjoint. ---*;
if . < START(i) <= STOP(j) and STOP(i) > STOP(j) > . then
do;
START(i)=STOP(j)+1;
end;
*--- Embedded in previous interval. Will not contribute to duration. ---*;
else
do;
if . < START(i) <= STOP(j) and . < STOP(i) <= STOP(j) then
do;
START(i)=.;
STOP(i)=.;
end;
end;
end;
if START(i) ne . and STOP(i) ne . then DUR=STOP(i)-START(i)+1;
else DUR=0;
TOTDUR+DUR;
end;
label TOTDUR='DURATION |(DAYS)';
run;

 

 

NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).

773:8 773:42 775:10 780:8 780:36 787:4 787:22 787:44 787:52

NOTE: Numeric values have been converted to character values at the places given by: (Line):(Column).

775:1 782:1 783:1

NOTE: Variable startdate  is uninitialized.

NOTE: Variable enddate is uninitialized.

NOTE: Missing values were generated as a result of performing an operation on missing values.

Each place is given by: (Number of times) at (Line):(Column).

1039 at 769:8 1039 at 769:36

NOTE: There were 1039 observations read from the data set WORK.AE_DURH.

NOTE: The data set WORK.METHOD1 has 1039 observations and 16 variables.

NOTE: DATA statement used (Total process time):

real time 0.00 seconds

cpu time 0.00 seconds

 

 

 

output

 

ID_LABEL_START1START2START3START4STOP1STOP2STOP3STOP4startdateenddateijDURTOTDUR
1enddate42103.  42150.  ..5400
2enddate41625   41702   ..5400
3enddate4167041862  4167041981  ..540120
4enddate4170141905419434204441701419064194642100..545763
5enddate41807   41855   ..5400
6enddate41563   41582   ..5400
7enddate42086   42139   ..5400
ballardw
Super User

The note here is pretty indicative:

NOTE: Variable startdate  is uninitialized.

NOTE: Variable enddate is uninitialized.

Your data set AE_DURH likely does not have the variables you think it does. Your references to those two variables in the created them in the intck function reference probably created them. So TOTDUR starts off missing.

 

 

Your code

if START(i) ne . and STOP(i) ne . then DUR=STOP(i)-START(i)+1;
else DUR=0;

means that at the end of the loop dur is going to have the comparison for the last pair of stop start, which is 0 because of the ELSE for most of your records.

A possible fix for the "dur" issue is to remove the Else. Maybe. I'm not sure what you are actually attempting to accomplish.

Jordani
Obsidian | Level 7

I am sorry for the typos. I did fixed that in the code and still not getting the result that I am supposed to get. I need to calculate the total number of business  absence days per employee. In this example employee number two has 3 records, the first record has the same start date but different end date than the rest of the records. The second record is embedded in the first one, and the third records is just a duplicate for the  second record. I cannot   calculate the total number by summing the absence days for all records because they are not unique. I wanna get something like this for example

 

ID   Total duration

1      24 days

2      49 days

 

 

I was able to get the total number of days by

TOTDUR=STOP1-START1+1

But I am interested in business days only, not all days between start and end date. You can try this code and put

 

TOTDUR=STOP1-START1+1 instead of 

TOTDUR=intck('WEEKDAY', START1+1, STOP1)

 

 

 

data have;
input ID (startdate enddate) (:mmddyy.);
format startdate enddate yymmdd10.;
datalines;
1         1-1-2015          1-17-2015
1          1-1-2015         1-25-2015
1          2-7-2015         2-25-2015
2          3-4-2016         5-8-2016
2          3-4-2016        5-12-2016
2          3-4-2016        5-12-2016
;

proc sort data= have ;
by ID startdate;
run;
proc transpose data=have out=STARTDT (drop=_name_) prefix=START;
by ID;
var startdate;
run;
proc transpose data=have out=STOPDT (drop=_name_) prefix=STOP;
by ID;
var enddate;
run;
data AE_DURH;
merge STARTDT STOPDT;
by ID;
run;
data METHOD1;
retain TOTDUR;
set AE_DURH; *--- Use Horizontal Data Representation. ---*;
array START{*} START:;
array STOP{*} STOP:;
TOTDUR=intck('WEEKDAY', START1+1, STOP1); 
do i=2 to dim(START);
do j=1 to i-1;
*--- Make overlapping but not embedded intervals disjoint. ---*;
if . < START(i) <= STOP(j) and STOP(i) > STOP(j) > . then
do;
START(i)=STOP(j)+1;
end;
*--- Embedded in previous interval. Will not contribute to duration. ---*;
else
do;
if . < START(i) <= STOP(j) and . < STOP(i) <= STOP(j) then
do;
START(i)=.;
STOP(i)=.;
end;
end;
end;
if START(i) ne . and STOP(i) ne . then DUR=STOP(i)-START(i)+1;
else DUR=0;
TOTDUR+DUR;
end;
label TOTDUR='DURATION |(DAYS)';
run;
proc print data=METHOD1;
run;
PGStats
Opal | Level 21

You can also do this with arrays

 

data have;
input ID (startdate enddate) (:mmddyy.);
format startdate enddate yymmdd10.;
datalines;
1         1-1-2015          1-17-2015
1          1-1-2015         1-25-2015
1          2-7-2015         2-25-2015
2          3-4-2016         5-8-2016
2          3-4-2016        5-12-2016
2          3-4-2016        5-12-2016
;

proc sql;
select min(startdate), max(enddate) 
into :from, :to
from have;
quit;

data want;
array d {&from : &to} _temporary_;
call missing(of d{*});
do until(last.id);
    set have; by id;
    do i = startdate to enddate;
        if 
            weekday(i) not in (1, 7) and         /* weekends */
            not (month(i)=12 and day(i)=25) and  /* Christmas */
            not (month(i)=1 and day(i)=1)        /* New Year */
                                                 /* ... etc. */
        then d{i} = 1;
        end;
    end;
duration = sum(of d{*});
keep ID duration;
run;

just complete the list of holidays for your locale.

 

Edit: The HOLIDAY function can help you define those if you are in USA or Canada.

PG
Jordani
Obsidian | Level 7

Hi PGStats,

 

Thank you so much. It worked and I got the numbers.

 

Can I incorporate a formula in the array?. I wanna put index date to calculate absence days up until the index date. The index date is not fixed for all employees, instead , it is 3 months from the day of first absence. I added an extra record for employee number 2. This absence is after the index date so it shouldn't be counted.

 

if end.date < index.date  then from startdate to enddate

if startdate > index.date then from start.date to start.date

else if  from start.date to index.date.

 

data have;
input ID (startdate enddate indexdate ) (:mmddyy.);
format startdate enddate indexdate yymmdd10.;
datalines;
1         1-1-2015          1-17-2015     4-1-2015
1          1-1-2015         1-25-2015     4-1-2015
1          2-7-2015         2-25-2015     4-1-2015
2          3-4-2016         5-8-2016       6-4-2016
2          3-4-2016        5-12-2016      6-4-2016
2          3-4-2016        5-12-2016      6-4-2016
2          8-5-2016        8-17-2016      6-4-2016 
;
PGStats
Opal | Level 21

I guess you need:

 

data have;
input ID (startdate enddate indexdate ) (:mmddyy.);
format startdate enddate indexdate yymmdd10.;
datalines;
1         1-1-2015          1-17-2015     4-1-2015
1          1-1-2015         1-25-2015     4-1-2015
1          2-7-2015         2-25-2015     4-1-2015
2          3-4-2016         5-8-2016       6-4-2016
2          3-4-2016        5-12-2016      6-4-2016
2          3-4-2016        5-12-2016      6-4-2016
2          8-5-2016        8-17-2016      6-4-2016 
;

proc sql noprint;
select min(startdate), max(enddate) 
into :from, :to
from have;
quit;

data want;
array d {&from : &to} _temporary_;
call missing(of d{*});
do until(last.id);
    set have; by id;
    if startDate < indexDate then do;
        do i = startdate to min(indexDate-1, enddate);
            if 
                weekday(i) not in (1, 7) and         /* weekends */
                not (month(i)=12 and day(i)=25) and  /* Christmas */
                not (month(i)=1 and day(i)=1)        /* New Year */
                                                     /* ... etc. */
            then d{i} = 1;
            end;
        end;
    end;
duration = sum(of d{*});
keep ID duration;
run;

Check if the -1 is required in min(indexDate-1, endDate).

 

PG

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1959 views
  • 0 likes
  • 3 in conversation