## Help generating additional observations in a timeseries data set

Solved
Occasional Contributor
Posts: 9

# Help generating additional observations in a timeseries data set

Hello,

I'm hoping someone can help me with this problem. I have time series data for emergency department visits which for each visit includes the date and time of the patients arrival, the date and time of the patients departure, and the patients acuity. What I am attempting to do is perform forecasting analysis based not only on the hour in which the patient arrived but each subsequent hour that the patient remained in the emergency department. In other words I want to calculate the staff workload for each hour that a given patient is in the ED.

I've attempted to use arrays but have failed.

Any suggestions?

Kevin Armstrong

Accepted Solutions
Solution
‎05-26-2013 01:37 PM
Posts: 5,543

## Re: Help generating additional observations in a timeseries data set

There must be some missing departure times in your data, perhaps for patients which are still hospitalized or patients that died? This should be investigated. In the meantime, one way to take care of this is to ignore patients with a missing arrival or departure time (I also added some comments to the code) :

/* Get the first and last hours present in the data */
proc sql noprint;
select
intnx("DTHOUR", min(arrive_dt), 0) format=15.0,
intnx("DTHOUR", max(depart_dt), 1) format=15.0
into :dtSt, :dtEnd
from sasforum.Array_sample_5_25_13;
quit;

/* Make a list of all hours between the first and last */
data hours;
dt = &dtSt.;
do until (dt >= &dtEnd.);
output;
dt = intnx("DTHOUR", dt, 1);
end;
format dt datetime16.;
run;

/* Create a time series of the changes in workload. Arivals increase the
workload, departures decrease it. Also intersperse the hours (with no
workload changes) in the time series. Sum all the changes that occured at
a given time stamp (e.g. two patients arrive at the same time) so that
each time stamp is unique in the time series */
proc sql;
create table times as
select
dt format=datetime13.,
sum(nsg_wk) as nsg_wk
from
(select arrive_dt as dt, nsg_wk from sasforum.Array_sample_5_25_13
where arrive_dt is not missing and depart_dt is not missing
union all corresponding
select depart_dt as dt, -nsg_wk as nsg_wk from sasforum.Array_sample_5_25_13
where arrive_dt is not missing and depart_dt is not missing
union all corresponding
select dt, 0 as nsg_wk from hours)
group by dt
order by dt;
quit;

/* Compute the cumulative sum of workload changes. Calculate durations
set times;
duration = dif(dt);
start = lag(dt);
if duration > 0 then output;
format start datetime16. duration time5.;
run;

/* Calculate the hourly average of workloads, weighted by their duration */
proc sql;
select
intnx("DTHOUR", start, 0) as dt format=datetime13.,
group by calculated dt;
quit;

PG

PG

All Replies
PROC Star
Posts: 1,317

## Re: Help generating additional observations in a timeseries data set

I think we need a little more information. Could you give an example of what your input data looks like, what you'd like as a result, and some details about how to calculate the result.

Tom

Occasional Contributor
Posts: 9

## Re: Help generating additional observations in a timeseries data set

Attached is a sample of the data I'm working with. The 'los' variable represents the number of minutes that the patient spent in the ED. For the first hour, and then each subsequent hour. I would like to generate workload ( nsg_wk) for each visit but that workload needs to be associated to the correct datetime. What I need is a set of timeseries data that is based on the workload each our (nsg_wk specifically). The minutes of nsg_wk needed each hour for each visit is based on the acuity level as well as the duration of the visit.

Below is my feeble attempt at working with an array to accomplish this but it obviously doesn't work. I stopped at 8 hours but at times the length of stay is longer than that.

I'm thinking that what is needed is to generate additional observations for each subsequent hour but I don't know how to do that.

Thanks for being willing to look at this.

Kevin

data nsgtime;
set array_sample;
format hour1-hour8 datetime24.;
/*array nsg_min {8,25} ;
format hour1-hour8 datetime24.;*/
do hour1= arrive_dt;
if los > 60 <120 then hour2= arrive_dt + 3600;
else hour2= .;
if los >120 <180 then hour3= arrive_dt + 7200;
else hour3= .;
if los >180 <240 then hour4= arrive_dt + 10800;
else hour4= .;
if los >240 <300 then hour5= arrive_dt + 14400;
else hour5= .;
if los >300 <360 then hour6= arrive_dt + 18000;
else hour6= .;
if los >360 <420 then hour7= arrive_dt + 21600;
else hour7= .;
if los >420 <480 then hour8= arrive_dt + 25200;
end;
array nsg_min {8,25} hour1-hour8 obs1-obs25;
if acuity= 1 then nmin= 15;
if acuity= 2 then nmin= 15;
if acuity= 3 and hour1 then nmin= 30; else nmin= 15;
if acuity= 4 and hour1 then nmin= 45; else nmin= 30;
if acuity= 5 and hour1 then nmin= 60; else nmin= 60;
if acuity=6 and hour1 then nmin= 120; else nmin= 60;
run;

PROC Star
Posts: 1,317

## Re: Help generating additional observations in a timeseries data set

So taking your twelfth observation as an example:
- arrival time is 2:04:00, departure time is 9:09:00, acuity is 6
- So the duration of the stay is 7 hours, five minutes, or 425 minutes.

For this visit, you want to set nsg_wk to 120 for the first hour, and then to 60 for each of hours 2 through 8.

From what you're saying, I'm kind of imagining the following output. I'm creating one record per hour for each visit, so I've tagged each observation with an "event" variable to relate them.

Does this look like what you're after?

event arrive_dt hour nsg_wk
1234    02:04     1   120
1234    03:04     2   60
1234    04:04     3   60
1234    05:04     4   60
1234    06:04     5   60
1234    07:04     6   60
1234    08:04     7   60
1234    09:04     8   60

Occasional Contributor
Posts: 9

## Re: Help generating additional observations in a timeseries data set

Wow great work Tom!

Assuming that arrive_dt in your sample will include the date then yes that is exactly what I'm looking for. I should then be able to sort by arrive_dt then perform proc timeseries on nsg_wk.

Are you a SAS employee or just a good samaritan?

Thanks very much,

Kevin

PROC Star
Posts: 1,317

## Re: Help generating additional observations in a timeseries data set

Hi, Kevin

Please shoot me an email at gishpuppy.x0k@gishpuppy.com

Tom

Posts: 5,543

## Re: Help generating additional observations in a timeseries data set

Here is how to get the complete list of hourly workloads from your data. What's missing is the initial workload level, here it is assumed to be zero. The calculated values are average workloads, weighted by their duration. The time values are the beginning of hourly periods.

proc sql noprint;
select
intnx("DTHOUR", min(arrive_dt), 0) format=15.0,
intnx("DTHOUR", max(depart_dt), 1) format=15.0
into :dtSt, :dtEnd
from sasforum.Array_sample_5_25_13;
quit;

data hours;
dt = &dtSt.;
do until (dt >= &dtEnd.);
output;
dt = intnx("DTHOUR", dt, 1);
end;
format dt datetime16.;
run;

proc sql;
create table times as
select
dt format=datetime13.,
sum(nsg_wk) as nsg_wk
from
(select arrive_dt as dt, nsg_wk from sasforum.Array_sample_5_25_13
union all corresponding
select depart_dt as dt, -nsg_wk as nsg_wk from sasforum.Array_sample_5_25_13
union all corresponding
select dt, 0 as nsg_wk from hours)
group by dt
order by dt;
quit;

set times;
duration = dif(dt);
start = lag(dt);
if duration > 0 then output;
format start datetime16. duration time5.;
run;

proc sql;
select
intnx("DTHOUR", start, 0) as dt format=datetime13.,
group by calculated dt;
quit;

PG

PG
Occasional Contributor
Posts: 9

## Re: Help generating additional observations in a timeseries data set

Hello PGStats,

Thanks very much for your help. The code you provided runs well on the sample data but when I attempt to run it on a full data set it generates large, negative values for workLoad. I'm a beginner SAS programmer and have had difficulty finding the cause of the problem. I eventually found that the first observation in the TIMES data set has a dt value of '.' and a workLoad value of '-12285'. I've attached the TIMES data set so you can reviewed if interested. Any thoughts on what is causing this error? I checked the full data set and it appears to me to be in the same format as the sample data so I'm perplexed.

I'll keep trying but perhaps the answer will jump out at you?

Thanks once again for your help.

Kevin

Solution
‎05-26-2013 01:37 PM
Posts: 5,543

## Re: Help generating additional observations in a timeseries data set

There must be some missing departure times in your data, perhaps for patients which are still hospitalized or patients that died? This should be investigated. In the meantime, one way to take care of this is to ignore patients with a missing arrival or departure time (I also added some comments to the code) :

/* Get the first and last hours present in the data */
proc sql noprint;
select
intnx("DTHOUR", min(arrive_dt), 0) format=15.0,
intnx("DTHOUR", max(depart_dt), 1) format=15.0
into :dtSt, :dtEnd
from sasforum.Array_sample_5_25_13;
quit;

/* Make a list of all hours between the first and last */
data hours;
dt = &dtSt.;
do until (dt >= &dtEnd.);
output;
dt = intnx("DTHOUR", dt, 1);
end;
format dt datetime16.;
run;

/* Create a time series of the changes in workload. Arivals increase the
workload, departures decrease it. Also intersperse the hours (with no
workload changes) in the time series. Sum all the changes that occured at
a given time stamp (e.g. two patients arrive at the same time) so that
each time stamp is unique in the time series */
proc sql;
create table times as
select
dt format=datetime13.,
sum(nsg_wk) as nsg_wk
from
(select arrive_dt as dt, nsg_wk from sasforum.Array_sample_5_25_13
where arrive_dt is not missing and depart_dt is not missing
union all corresponding
select depart_dt as dt, -nsg_wk as nsg_wk from sasforum.Array_sample_5_25_13
where arrive_dt is not missing and depart_dt is not missing
union all corresponding
select dt, 0 as nsg_wk from hours)
group by dt
order by dt;
quit;

/* Compute the cumulative sum of workload changes. Calculate durations
set times;
duration = dif(dt);
start = lag(dt);
if duration > 0 then output;
format start datetime16. duration time5.;
run;

/* Calculate the hourly average of workloads, weighted by their duration */
proc sql;
select
intnx("DTHOUR", start, 0) as dt format=datetime13.,
group by calculated dt;
quit;

PG

PG
PROC Star
Posts: 1,317

## Re: Help generating additional observations in a timeseries data set

Hi, @KevinArmstrong

I had been going to bang out some code for this using data steps, but since has stepped up with his SQL wizardry I may not have to!

Let us know if his code meets your needs (and you should award him a "right answer" if it does!)

Best,

Tom

Occasional Contributor
Posts: 9

## Re: Help generating additional observations in a timeseries data set

Thanks very much Tom and yes Pierre's answer was extremely helpful. My thanks to you as well for your willingness to help....what a tremendous resource this discussion board is!

I have two other SAS programming needs as part of my doctoral project, 1/ Linear Programming, and 2/ XML in order to export forecast workload results into scheduling software (Kronos specifically).

I am looking for assistance in both of these areas also so if you or Pierre are able and interested please let me know.

Kevin

PROC Star
Posts: 1,317

## Re: Help generating additional observations in a timeseries data set

I deal more with the data management end of SAS, but there are lots of people who can advise about the LP portion of your work.

I'll keep an eye out for your XML issues. I can pretty much guarantee it'll be dead simple in SAS.

Best,

Tom

Occasional Contributor
Posts: 9

## Re: Help generating additional observations in a timeseries data set

Tres bon Monsieur Gagnon!

Merci beaucoup pour votre aide.

I ran this code on a full set of data (I have patient level data for 4 different EDs each over a 6 year period) and with the exception of higher than expected workload at times everything looks good. The high workload may be accurate, I just need to look at it carefully.

This appears to be a very elegant solution to my challenge....very well done and many thanks.

I am a part-time doctoral student (full-time ED Director) and a new SAS user. My next step is to now take the workload data and run it through an ARIMA forecasting model someone at SAS has been helping me develop. Once workload per hour is forecast I would like to use Linear Programming in SAS to generate the optimum daily staff schedule by defining specific shift start times and durations. Do you have any suggestions on that?

I'm new to the SAS discussion board and don't want to offend you by offering this but I would be willing to compensate you for your time if you are interested.

Anyway thanks very much for your help with this Mr. Gagnon....I sincerely appreciate it.

Kevin

Posts: 5,543

## Re: Help generating additional observations in a timeseries data set

Hi Kevin,

I'm happy I could be of some help. And I also appreciate the social usefulness of your project. Taking a step back and considering your mention of ARIMA (which means you have access to SAS/ETS) makes me realize that a couple of steps in my little program could probably be replaced with a call to proc expand.  Oh well!

I am flattered by your offer, but I am fully employed and the other parts of your study would take me quite far from my comfort zone.

Wishing you the best with your project,

Au plaisir!

PG

PG
Occasional Contributor
Posts: 9

## Re: Help generating additional observations in a timeseries data set

Hi PG,

I understand...thank you very much once again for your help.

Cheers,

Kevin

🔒 This topic is solved and locked.

Discussion stats
• 14 replies
• 341 views
• 3 likes
• 3 in conversation