BookmarkSubscribeRSS Feed
Mike_D
Calcite | Level 5
Hi, I am still very new to SAS and am having a problem with figuring out the following problem.

I’d be most appreciative for any guidance on how to tackle this problem. I’ve tried using between…and, do…while, lag, and retain, but I’m missing something that’s preventing me from solving this.

Here’s a sample dataset I have:

Id unit time1 time2

1 E1 12:00:01 13:00:00
2 E2 12:15:00 13:01:00
4 E3 12:25:00 13:15:00
5 E1 13:20:00 13:45:00
6 E4 13:30:00 14:15:00
7 E2 13:50:00 14:30:00
8 E4 15:00:00 15:10:00
11 E1 15:05:00 16:00:00

Below is the dataset I’d like to end up with.

The only thing that has changed is that there is now a fifth variable (events). Events describes the number of events that are active. The events count is explained below.

Id Unit Time1 Time2 Events

1 E1 12:00:01 13:00:00 1
2 E2 12:15:00 13:19:00 2
3 E3 12:25:00 13:15:00 3
4 E1 13:20:00 13:45:00 1
5 E4 13:30:00 14:15:00 2
6 E2 13:50:00 14:30:00 2
8 E4 15:00:00 15:10:00 1
11 E1 15:05:00 16:00:00 2

The first obs has an Events value of 1 because it is the only active event.

The second Events has a value of 2 because when event 2 started event 1 was still in progress.

The third Event has value of 3 because Events 1 and 2 were still active when obs 3 began. (Time1 for the third obs is < time2 for both obs 1 and obs2).

Event four has a value of 1 because a check of the Time2(s) above show all have ended prior to time1 for the fourth obs.

The goal is to know how many concurrent events are occurring during a given period, based on time1 being less than the time2 for all the obs prior.


The following code works (kind of, but it is the closet I gotten)


Data new;
Set org;

If time1 lt time2 then Events = ‘1’;
If time1 lt time2 and time1 lt lag(time2) then Events = ‘2’;

Run;


But this is where I’m stuck. It is possible that lag(time2) has concluded, but lag2(time2) and lag3(time2) have not, and the if…then statements don’t seem to be able to handle this.

Ultimately what I need to be able to do for each time1 is to be able to compare it again against each time2 “above” it and ascertain how many active events I have going on.

My attempts with loops, retain, sum, and lagX(time2) have not even come close, but my guess is they are all needed. My thought is I need someway to create a loop that says I'm on the 10th obs and need to check time2 obs 10, 9, 8, ... with sum+1 for each time it finds a time2 that is > the time1 for obs ten.

I hope this explanation has been clear and someone can point me in the right direction.

A couple of notes: Not every 'id' is included in the dataset because not every 'id' resulted in an event. Also, some 'id's could result in multiple events. And the actual dataset has the following date time format 01Jan09:13:00:00. I'm not sure if any of this matters for the solution, but I certainly want to make this as clear as possible.

Thanks, Mike
8 REPLIES 8
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
Implicitly you will need to first sort your data to ensure that it is in proper sequence. My recommendation is to track your "minimum concurrent event" time with a RETAIN and a temporary SAS variable rather than using LAG.

Scott Barry
SBBWorks, Inc.
Mike_D
Calcite | Level 5
Thanks Scott,

Just to make sure I'm completely understand I should probably use Retain to capture and compare the current time1 and assign the time2 to a temporary variable and compare the temp vairable to the retain value?

Thanks again, Mike
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
For whatever your "concurent event matching" criteria, keep track of the oldest start-time (and consider that I don't know what you're doing about date-change?) and compare that time to the TIME1 and TIME2 (range) values of subsequent observations, I suppose.

You may want to map the process out on paper as a "stick figures" approach to developing a determination methodology and then apply that to the SAS language with your variables.

Scott Barry
SBBWorks, Inc.
deleted_user
Not applicable
For every observation, you want to get the number of events unfinished at time1. I would sort the data after time1 variable. I don't think you could avoid keeping sequential the time2 of the unfinished events.
Mike_D
Calcite | Level 5
Thanks to Scott and Vasile for trying to help me with this.

After battIing this for days I just ended up just doing it in Excel. If anyone ever stops by here and is fluent in both vba and sas maybe they could help me with the translation, I'd really like to know a way to tackle this in SAS. I'd hoped SAS would put an end to jumping around between Access and Excel, but I guess they all have their place.

Thanks, Mike

Sub ActiveIncidents()

Dim irow, lastrow As Integer
Dim frmla As String

lastrow = ActiveSheet.UsedRange.Rows.Count

For irow = 2 To lastrow

frmla = "=countif(M2:M" & irow & "," & """GT" & """&L" & irow & ")"
ActiveSheet.Cells(irow, 15).Formula = frmla
Next irow

'for posting "GT" used instead of greater than sign

End Sub

Message was edited by: Mike D
Patrick
Opal | Level 21
Hi Mike

The following code should do what you asked for (using the example data provided):

data have;
infile datalines truncover;
input Id unit $ time1 time. time2 time.;
format time1 time2 time8.;
datalines;
1 E1 12:00:01 13:00:00
2 E2 12:15:00 13:01:00
4 E3 12:25:00 13:15:00
5 E1 13:20:00 13:45:00
6 E4 13:30:00 14:15:00
7 E2 13:50:00 14:30:00
8 E4 15:00:00 15:10:00
11 E1 15:05:00 16:00:00
;

proc sql;
select l.Id, l.unit, l.time1, l.time2, count(*) as Event
from have as l left join have as r
on l.id >= r.id and l.time1
group by l.Id, l.unit, l.time1, l.time2
;
quit;

The assumptions are:
- The id's are ascending based on ascending start times (time1)
- There are no duplicate start times
- Data is from 1 day (not passing midnight).

Change of date: It would be safer to use datetime values instead of time values.

HTH
Patrick
Mike_D
Calcite | Level 5
Hi Patrick,

Thanks I really do want to be able to figure this out in SAS. I'll give it a shot.

Mike
deleted_user
Not applicable
data times;
input id unit $ time1 $ time2 $;
length time1 $ 8 time2 $ 8;
datalines;
1 E1 12:00:01 13:00:00
2 E2 12:15:00 13:01:00
4 E3 12:25:00 13:15:00
11 E1 13:06:00 13:09:00
;
proc sort data=times out=sort_times;
by time1;
run;



proc sql;
create table new like sort_times;
alter table new add events num;
quit;

%macro event(t, i);
%let t1=trim("&t");
data temp (keep= id unit time1 time2 events);
set sort_times;
retain events 0;
if trim(time2)>= &t1 & trim(time1)< &t1 then events+1;

if id=&i then do;
output temp;
stop;
end;

proc append base=new
data=temp;
run;
%mend;

data _null_;
set sort_times;
call symputx('timex', trim(time1));
call symputx('idx', id);
call execute ('%event(&timex, &idx)');
proc print data=new;
run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1040 views
  • 0 likes
  • 4 in conversation