Hi, I am still very new to SAS and am having a problem with figuring out the following problem.
I’d be most appreciative for any guidance on how to tackle this problem. I’ve tried using between…and, do…while, lag, and retain, but I’m missing something that’s preventing me from solving this.
Here’s a sample dataset I have:
Id unit time1 time2
1 E1 12:00:01 13:00:00
2 E2 12:15:00 13:01:00
4 E3 12:25:00 13:15:00
5 E1 13:20:00 13:45:00
6 E4 13:30:00 14:15:00
7 E2 13:50:00 14:30:00
8 E4 15:00:00 15:10:00
11 E1 15:05:00 16:00:00
Below is the dataset I’d like to end up with.
The only thing that has changed is that there is now a fifth variable (events). Events describes the number of events that are active. The events count is explained below.
Id Unit Time1 Time2 Events
1 E1 12:00:01 13:00:00 1
2 E2 12:15:00 13:19:00 2
3 E3 12:25:00 13:15:00 3
4 E1 13:20:00 13:45:00 1
5 E4 13:30:00 14:15:00 2
6 E2 13:50:00 14:30:00 2
8 E4 15:00:00 15:10:00 1
11 E1 15:05:00 16:00:00 2
The first obs has an Events value of 1 because it is the only active event.
The second Events has a value of 2 because when event 2 started event 1 was still in progress.
The third Event has value of 3 because Events 1 and 2 were still active when obs 3 began. (Time1 for the third obs is < time2 for both obs 1 and obs2).
Event four has a value of 1 because a check of the Time2(s) above show all have ended prior to time1 for the fourth obs.
The goal is to know how many concurrent events are occurring during a given period, based on time1 being less than the time2 for all the obs prior.
The following code works (kind of, but it is the closet I gotten)
Data new;
Set org;
If time1 lt time2 then Events = ‘1’;
If time1 lt time2 and time1 lt lag(time2) then Events = ‘2’;
Run;
But this is where I’m stuck. It is possible that lag(time2) has concluded, but lag2(time2) and lag3(time2) have not, and the if…then statements don’t seem to be able to handle this.
Ultimately what I need to be able to do for each time1 is to be able to compare it again against each time2 “above” it and ascertain how many active events I have going on.
My attempts with loops, retain, sum, and lagX(time2) have not even come close, but my guess is they are all needed. My thought is I need someway to create a loop that says I'm on the 10th obs and need to check time2 obs 10, 9, 8, ... with sum+1 for each time it finds a time2 that is > the time1 for obs ten.
I hope this explanation has been clear and someone can point me in the right direction.
A couple of notes: Not every 'id' is included in the dataset because not every 'id' resulted in an event. Also, some 'id's could result in multiple events. And the actual dataset has the following date time format 01Jan09:13:00:00. I'm not sure if any of this matters for the solution, but I certainly want to make this as clear as possible.
Thanks, Mike