BookmarkSubscribeRSS Feed
Marcusliat
Calcite | Level 5

Hi,

This is a continuation of the following threads, the second one was unanswered.

I have the following data and code (a different scenario). The fake data is just for illustration. The goal is to compute the duration by name and by date. In this example Beth works midnight till 7:30 am and come back to work 9:00 pm till midnight the same day. The gap is then between 7:30 am and 9:00 pm. The program works for Alex that has a different work schedule (no gap during the day, 7:30-3:30 pm). The program did not give me the expected result for Beth .

DATA fake;

   LENGTH status $ 6;

   INPUT @1 dt:datetime16. @18 name$

format dt datetime18.;

date = datepart(dt);

time = timepart(dt);

datalines;

04Jan11:00:21:12 Beth

04Jan11:02:19:06 Beth

04Jan11:05:00:13 Beth

04Jan11:07:12:11 Beth

04Jan11:21:10:25 Beth

04Jan11:23:26:13 Beth

05Jan11:00:23:51 Beth

05Jan11:01:30:10 Beth

05Jan11:03:12:06 Beth

05Jan11:05:06:15 Beth

05Jan11:07:23:11 Beth

05Jan11:21:30:25 Beth

05Jan11:23:28:15 Beth

04Jan11:07:43:00 Alex

04Jan11:08:12:10 Alex

04Jan11:09:40:00 Alex

04Jan11:11:20:16 Alex

04Jan11:12:25:14 Alex

04Jan11:14:20:07 Alex

;

run;

proc sort; by name date time;

data new;

     format date date9. time time8.;

    set fake;

  by name date time;

   if first.date  then do;

    if (minute(dt)>30) then newdate=dhms(date,hour(dt),0,1800);

    if (minute(dt)<=30) then newdate=dhms(date,hour(dt),0,0);

  end;

    if last.date then do;

      if (minute(dt)>30) then newdate=dhms(date,hour(dt),0,3600);

      if (minute(dt)<=30) then newdate=dhms(date,hour(dt),0,1800);

  end;

  format newdate datetime16.;

run;

proc means data=new nway noprint;

class date name;

format newdate dtdate9.;

output out=want (drop=_:) range(newdate)=duration;

run;

data want_output;

set want;

duration_time=duration;

duration=duration/60/60;

format duration;

format duration_time hhmm8.2 date date9.;

run;

The resulting output is

04JAN2011    Alex       7       7:00.00

04JAN2011    Beth    23.5    23:30.00

05JAN2011    Beth    23.5    23:30.00

The desired output is

04JAN2011    Alex       7       7:00.00

04JAN2011    Beth    10.5    10:30.00

05JAN2011    Beth    10.5    10:30.00

There are wonderful helpers in this site. I appreciate any help.

Thanks.

7 REPLIES 7
ballardw
Super User

Your second question didn't really have a valid dataset to test in the previous post.

The record with data in the other post:

19APR12:00:24:01 19APR12:00:00:00 Beth 4

was out of order and would have resulted in a much longer interval.

And your example data again

04Jan11:00:21:12 Beth

04Jan11:02:19:06 Beth

04Jan11:05:00:13 Beth

04Jan11:07:12:11 Beth -> start of a roughly 14 hour gap. WHAT IS THE EXPLICIT RULE THAT SAYS THERE

IS AN END AT AROUND 11:00:00 TO GET YOUR 10.5?

04Jan11:21:10:25 Beth -> AND THIS LOOKS LIKE THE TIME IS SUPPOSED TO BE COUNTED IN THE NEXT DAY

WHAT IS THE RULE?

04Jan11:23:26:13 Beth

If you have a separate table that shows assigned shifts this might be possible but without a general set of rules we can generate code that will work for one or two explicit cases but not a generic data set.

If you can say something like "if the gap exceeds 2 hours then consider that the start of a new interval" but it needs to be something that is valid EVERY time it occurs. The next part is which date does the total get assigned to? The date the "shift" starts or ends?

Marcusliat
Calcite | Level 5

Thanks for your reply.

In this fake data, Beth starts at midnight 04Jan11:00:00:00, get first assignment at  04Jan11:00:21:12 and last assignment in the morning was at 04Jan11:07:12:11 and leave at 04Jan11:07:30:00. Beth get back the same day (yes gap exceed 2 hours) at 04Jan21:00:00:00 (9:00 pm), get first assignment at  04Jan11:21:10:25 and last assignment at 04Jan11:23:26:13 and leave at 04Jan11:23:59:59 (midnight) the same day. In short the schedule for Beth is 9:00 pm to 7:30 am the next day. 

I can say "if the gap exceeds 2 hours then new interval". The duration is the difference between each starting date and ending date in each interval, by name and by day.

Reeza
Super User

I can say "if the gap exceeds 2 hours then new interval". The duration is the difference between each starting date and ending date in each interval, by name and by day.

That makes Beth have 3 shifts on the 4th of January. If this is acceptable then add the Shift variable to proc means.  Run proc means twice, once by date/shift and then use those results to sum by date.

data new;

     format date date9. time time8.;

    set fake;

  by name date time;

  retain shift 0;

   if first.date  then do;

    if (minute(dt)>30) then newdate=dhms(date,hour(dt),0,1800);

    if (minute(dt)<=30) then newdate=dhms(date,hour(dt),0,0);

  end;

    if last.date then do;

      if (minute(dt)>30) then newdate=dhms(date,hour(dt),0,3600);

      if (minute(dt)<=30) then newdate=dhms(date,hour(dt),0,1800);

  end;

  format newdate datetime16.;

 

  dif_time=dif(time);

  if first.date then do;

  dif_time=.;

  shift=1;

  end;

  if (dif_time/60/60)>2 then shift+1;

 

run;

Marcusliat
Calcite | Level 5

Thank you for the attempt. It is my fault if it is unclear.

It is true Beth has three shifts ( I changed the difference to exceed 5 hours to be accurate) as should be seen in newdate:

1. 04JAN11:00:00:00 till 04JAN11:07:30:00

2: 04JAN11:21:00:00 till 05JAN11:07:30:00, this shift is the bottle neck. It crosses two days.

3: 05JAN11:21:00:00 till 05JAN11:00:00:00

I still would like an output per date and per name. The desired output is

dt                 name   duration  duration_time

04JAN2011    Alex       7       7:00.00

04JAN2011    Beth    10.5    10:30.00

05JAN2011    Beth    10.5    10:30.00

Thanks,

Reeza
Super User

As I mentioned you can run two proc means and get the results you need. I'd really rather not write the whole code for you, since your getting paid for it, and more importantly it shows you understand whats happening.

ballardw
Super User

And again

leave at 04Jan11:23:59:59 (midnight) the same day

requires an assumption as that is not in the data.

With

04Jan11:23:26:13 Beth

05Jan11:00:23:51 Beth

It really doesn't look like she "left" at midnight.

Marcusliat
Calcite | Level 5

Please see my reply to Reeza.

The new variable newdata was created to display the beginning and ending time.

Beth left at midnight. The next data is the next day so she left.

04Jan11:23:26:13 Beth

05Jan11:00:23:51 Beth

It really doesn't look like she "left" at midnight.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1140 views
  • 0 likes
  • 3 in conversation