BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
D_Z_
Obsidian | Level 7

Good Day All!

 

I am having a bit of a memory block on this one.

 

What i have essentially is a list of users and a a seperate record and timestamp each time they changed thier status in the production system we are using.   What i am looking to do in the end is get a total time of all statuses that one of the users had during the day, then skip the very last status for that agent as the next record will have a different agent name... this is the part that is confusing me... 

 

At the end... i want to add up the statuses to get a total time in each status... so like i can see available time (waiting for an issue)... working (working on an issue) and unavailable (not working or waiting for an issue)

 

Can anyone give me some help in what i need to do to accomplish this?  It's straight forward data but i just can't figure out the right combination of code to make it happen.

 

Thanks,

Dean

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

As long as other variables are not particularly important, the simplest way to compute this would be to calculate based on the ending record, not the starting record.  For example:

 

proc sort data=have;

by user timestamp;

run;

 

data want;

set have;

by user;

prior_status = lag(status);

duration = dif(timestamp);

if first.user=0;

keep user prior_status duration;

run;

View solution in original post

7 REPLIES 7
Astounding
PROC Star

As long as other variables are not particularly important, the simplest way to compute this would be to calculate based on the ending record, not the starting record.  For example:

 

proc sort data=have;

by user timestamp;

run;

 

data want;

set have;

by user;

prior_status = lag(status);

duration = dif(timestamp);

if first.user=0;

keep user prior_status duration;

run;

D_Z_
Obsidian | Level 7

Thanks for that... it is some awesome code... 

 

so i have data that looks like this:

 

10MAY16:07:58:14	Available	0:09:44
10MAY16:08:00:53	Unavailable	0:02:39
10MAY16:08:08:20	Available	0:07:28
10MAY16:08:52:18	Work In Progress	0:43:57
10MAY16:08:55:18	Available	0:03:00
10MAY16:09:06:17	Work In Progress	0:10:59
10MAY16:09:36:49	Unavailable	0:30:32
10MAY16:10:25:34	Available	0:48:44
10MAY16:10:55:53	Unavailable	0:30:19
10MAY16:11:10:54	Available	0:15:02
10MAY16:11:15:18	Work In Progress	0:04:23
10MAY16:12:23:02	Unavailable	1:07:44
10MAY16:12:32:34	Available	0:09:32
10MAY16:12:54:10	Work In Progress	0:21:37
10MAY16:13:00:19	Available	0:06:09
10MAY16:13:33:23	Work In Progress	0:33:03
10MAY16:14:23:41	Unavailable	0:50:19
10MAY16:14:34:24	Available	0:10:42
10MAY16:14:38:09	Work In Progress	0:03:45
10MAY16:14:52:01	Available	0:13:53

So the general idea is that i can sum up available times, unavailable times and work in progress times and get an accurate #... however.... the correct data is one line down if that makes sense.... so on the very first line... the available time i want to be 2:39 instead of the 2nd line for the duration... 

 

I am not sure how to do this... but i am going to learn it tonight i have a feeling so i can do this in the future... its really cool.  I totally appreciate your help.

 

Dean

ballardw
Super User

I think you need to show more detail as I am not sure how you are intending to use the "second line". Show the values such as in arithmetic and the expected result for a few lines of the data.

D_Z_
Obsidian | Level 7

So  basically what i am doing is starting from a list that shows me user, status, timestamp (that user went into said status)... then i have 3 status catagories  (Available, Unavailable, and Work in Progress).

 

What i want to do is calculate the total sum of each status by user.  The first step was to understand how to get the differences between the timestamps and into the correct fields so i can do the calculation afterwards.  The the last record for each user should be blank by default because there is not another login time below it for the same user.

 

Now the code above works great... but the data is on the wrong line... the "Is" column as shown below generates the calculation and the "Should Be" is what i am looking for it to read.  So in essence, off by one line. and there was data in the very last record on each agent... which i want to keep blank.

 

Taking the first 2 lines of the table as an example... john was available agt 7:58:14, and then went unavailable at 8:00:53... which means that John was Available for 2.6 minutes.... so in the line one column 3 (Available) i want to see john as available for 2.6 minutes., not 9.7 minutes... but the 2.6 minutes in the "Is" column is on line 2.

 

Agent Timestamp Prior_Status Is Should Be
John Doe 10MAY16:07:58:14 Available 9.7 2.6
John Doe 10MAY16:08:00:53 Unavailable 2.6 7.5
John Doe 10MAY16:08:08:20 Available 7.5 44
John Doe 10MAY16:08:52:18 Work In Progress 44 3
John Doe 10MAY16:08:55:18 Available 3 11
John Doe 10MAY16:09:06:17 Work In Progress 11 30.5
John Doe 10MAY16:09:36:49 Unavailable 30.5 48.7
John Doe 10MAY16:10:25:34 Available 48.7 30.3
John Doe 10MAY16:10:55:53 Unavailable 30.3 15
John Doe 10MAY16:11:10:54 Available 15 4.4
John Doe 10MAY16:11:15:18 Work In Progress 4.4 67.7
John Doe 10MAY16:12:23:02 Unavailable 67.7 --

 

 

ballardw
Super User

Quick and (very) dirty: Sort the data by agent and descending timestamp; Use @Astounding's code but muliply the dif by -1 then resort the result by agent and timestamp.

Though I might look for something a bit more elegant if there were millions of records involved. Maybe.

 

Astounding
PROC Star

I think the code I gave you originally should be fine, as long as you focus on the calculated variable named DURATION.  The variable named LS might be the one that was already in your data, but should be ignored.

 

At any rate, if you have additional variables that you want to keep around, here is another way.

 

data want;

set have;

by id;

if last.id=0 then do;

   _n_ = _n_ + 1;

   set have (keep=LS rename=(LS=replacement_LS)) point=_n_;

end;

else replacement_LS=.;

run;

 

This code moves the duration back to the previous record, using the variable name REPLACEMENT_LS.

D_Z_
Obsidian | Level 7

Thanks Astounding!  I feel like an idiot though... our original code worked perfectly... i had to go back to the raw data before it was imported to check something and realized that i was wrong... everything was in the right place... I can't believe i missed it... 

 

Everything is good to go... thanks for the outstanding code... you saved my life and made my VP very happy.  She has been after this data for over a year and knew there was a way to do it... just had to reach out.... thanks again!

 

Dean

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 960 views
  • 0 likes
  • 3 in conversation