turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- SAS Procedures
- /
- Help generating additional observations in a times...

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-25-2013 02:40 PM

Hello,

I'm hoping someone can help me with this problem. I have time series data for emergency department visits which for each visit includes the date and time of the patients arrival, the date and time of the patients departure, and the patients acuity. What I am attempting to do is perform forecasting analysis based not only on the hour in which the patient arrived but each subsequent hour that the patient remained in the emergency department. In other words I want to calculate the staff workload for each hour that a given patient is in the ED.

I've attempted to use arrays but have failed.

Any suggestions?

Thanks in advance.

Kevin Armstrong

Accepted Solutions

Solution

05-26-2013
01:37 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-26-2013 01:37 PM

There must be some missing departure times in your data, perhaps for patients which are still hospitalized or patients that died? This should be investigated. In the meantime, one way to take care of this is to ignore patients with a missing arrival or departure time (I also added some comments to the code) :

**/* Get the first and last hours present in the data */****proc sql noprint;****select **** intnx("DTHOUR", min(arrive_dt), 0) format=15.0, **** intnx("DTHOUR", max(depart_dt), 1) format=15.0****into :dtSt, :dtEnd****from sasforum.Array_sample_5_25_13;****quit;**

**/* Make a list of all hours between the first and last */****data hours;****dt = &dtSt.;****do until (dt >= &dtEnd.);**** output;**** dt = intnx("DTHOUR", dt, 1);**** end;****format dt datetime16.;****run;**

**/* Create a time series of the changes in workload. Arivals increase the ****workload, departures decrease it. Also intersperse the hours (with no****workload changes) in the time series. Sum all the changes that occured at ****a given time stamp (e.g. two patients arrive at the same time) so that ****each time stamp is unique in the time series */****proc sql;****create table times as****select **** dt format=datetime13., **** sum(nsg_wk) as nsg_wk****from ****(select arrive_dt as dt, nsg_wk from sasforum.Array_sample_5_25_13**** where arrive_dt is not missing and depart_dt is not missing**** union all corresponding**** select depart_dt as dt, -nsg_wk as nsg_wk from sasforum.Array_sample_5_25_13**** where arrive_dt is not missing and depart_dt is not missing**** union all corresponding**** select dt, 0 as nsg_wk from hours)****group by dt****order by dt;****quit;**

**/* Compute the cumulative sum of workload changes. Calculate durations****between consecutive workload changes */****data workLoads(keep=start duration workload);****set times;****duration = dif(dt);****start = lag(dt);****if duration > 0 then output;****workLoad + nsg_wk;****format start datetime16. duration time5.;****run;**

**/* Calculate the hourly average of workloads, weighted by their duration */****proc sql;****create table hourlyWorkloads as****select **** intnx("DTHOUR", start, 0) as dt format=datetime13.,**** sum(duration*workload)/sum(duration) as mean_workload****from workloads****group by calculated dt;****quit;**

PG

PG

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-25-2013 03:16 PM

I think we need a little more information. Could you give an example of what your input data looks like, what you'd like as a result, and some details about how to calculate the result.

Tom

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-25-2013 03:58 PM

Thanks for the reply Tom.

Attached is a sample of the data I'm working with. The 'los' variable represents the number of minutes that the patient spent in the ED. For the first hour, and then each subsequent hour. I would like to generate workload ( nsg_wk) for each visit but that workload needs to be associated to the correct datetime. What I need is a set of timeseries data that is based on the workload each our (nsg_wk specifically). The minutes of nsg_wk needed each hour for each visit is based on the acuity level as well as the duration of the visit.

Below is my feeble attempt at working with an array to accomplish this but it obviously doesn't work. I stopped at 8 hours but at times the length of stay is longer than that.

I'm thinking that what is needed is to generate additional observations for each subsequent hour but I don't know how to do that.

Thanks for being willing to look at this.

Kevin

data nsgtime;

set array_sample;

format hour1-hour8 datetime24.;

/*array nsg_min {8,25} ;

format hour1-hour8 datetime24.;*/

do hour1= arrive_dt;

if los > 60 <120 then hour2= arrive_dt + 3600;

else hour2= .;

if los >120 <180 then hour3= arrive_dt + 7200;

else hour3= .;

if los >180 <240 then hour4= arrive_dt + 10800;

else hour4= .;

if los >240 <300 then hour5= arrive_dt + 14400;

else hour5= .;

if los >300 <360 then hour6= arrive_dt + 18000;

else hour6= .;

if los >360 <420 then hour7= arrive_dt + 21600;

else hour7= .;

if los >420 <480 then hour8= arrive_dt + 25200;

end;

array nsg_min {8,25} hour1-hour8 obs1-obs25;

if acuity= 1 then nmin= 15;

if acuity= 2 then nmin= 15;

if acuity= 3 and hour1 then nmin= 30; else nmin= 15;

if acuity= 4 and hour1 then nmin= 45; else nmin= 30;

if acuity= 5 and hour1 then nmin= 60; else nmin= 60;

if acuity=6 and hour1 then nmin= 120; else nmin= 60;

run;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-25-2013 05:40 PM

So taking your twelfth observation as an example:

- arrival time is 2:04:00, departure time is 9:09:00, acuity is 6

- So the duration of the stay is 7 hours, five minutes, or 425 minutes.

For this visit, you want to set nsg_wk to 120 for the first hour, and then to 60 for each of hours 2 through 8.

From what you're saying, I'm kind of imagining the following output. I'm creating one record per hour for each visit, so I've tagged each observation with an "event" variable to relate them.

Does this look like what you're after?

event arrive_dt hour nsg_wk

1234 02:04 1 120

1234 03:04 2 60

1234 04:04 3 60

1234 05:04 4 60

1234 06:04 5 60

1234 07:04 6 60

1234 08:04 7 60

1234 09:04 8 60

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-25-2013 06:02 PM

Wow great work Tom!

Assuming that arrive_dt in your sample will include the date then yes that is exactly what I'm looking for. I should then be able to sort by arrive_dt then perform proc timeseries on nsg_wk.

Are you a SAS employee or just a good samaritan?

Thanks very much,

Kevin

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-02-2013 07:20 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-25-2013 10:06 PM

Here is how to get the complete list of hourly workloads from your data. What's missing is the initial workload level, here it is assumed to be zero. The calculated values are average workloads, weighted by their duration. The time values are the beginning of hourly periods.

**proc sql noprint;****select **** intnx("DTHOUR", min(arrive_dt), 0) format=15.0, **** intnx("DTHOUR", max(depart_dt), 1) format=15.0****into :dtSt, :dtEnd****from sasforum.Array_sample_5_25_13;****quit;**

** **

**data hours;****dt = &dtSt.;****do until (dt >= &dtEnd.);**** output;**** dt = intnx("DTHOUR", dt, 1);**** end;****format dt datetime16.;****run;**

** **

**proc sql;****create table times as****select **** dt format=datetime13., **** sum(nsg_wk) as nsg_wk****from ****(select arrive_dt as dt, nsg_wk from sasforum.Array_sample_5_25_13**** union all corresponding**** select depart_dt as dt, -nsg_wk as nsg_wk from sasforum.Array_sample_5_25_13**** union all corresponding**** select dt, 0 as nsg_wk from hours)****group by dt****order by dt;****quit;**

** **

**data workLoads(keep=start duration workload);****set times;****duration = dif(dt);****start = lag(dt);****if duration > 0 then output;****workLoad + nsg_wk;****format start datetime16. duration time5.;****run;**

** **

**proc sql;****create table hourlyWorkloads as****select **** intnx("DTHOUR", start, 0) as dt format=datetime13.,**** sum(duration*workload)/sum(duration) as mean_workload****from workloads****group by calculated dt;****quit;**

** **

PG

PG

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-26-2013 11:43 AM

Hello PGStats,

Thanks very much for your help. The code you provided runs well on the sample data but when I attempt to run it on a full data set it generates large, negative values for workLoad. I'm a beginner SAS programmer and have had difficulty finding the cause of the problem. I eventually found that the first observation in the TIMES data set has a dt value of '.' and a workLoad value of '-12285'. I've attached the TIMES data set so you can reviewed if interested. Any thoughts on what is causing this error? I checked the full data set and it appears to me to be in the same format as the sample data so I'm perplexed.

I'll keep trying but perhaps the answer will jump out at you?

Thanks once again for your help.

Kevin

Solution

05-26-2013
01:37 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-26-2013 01:37 PM

There must be some missing departure times in your data, perhaps for patients which are still hospitalized or patients that died? This should be investigated. In the meantime, one way to take care of this is to ignore patients with a missing arrival or departure time (I also added some comments to the code) :

**/* Get the first and last hours present in the data */****proc sql noprint;****select **** intnx("DTHOUR", min(arrive_dt), 0) format=15.0, **** intnx("DTHOUR", max(depart_dt), 1) format=15.0****into :dtSt, :dtEnd****from sasforum.Array_sample_5_25_13;****quit;**

**/* Make a list of all hours between the first and last */****data hours;****dt = &dtSt.;****do until (dt >= &dtEnd.);**** output;**** dt = intnx("DTHOUR", dt, 1);**** end;****format dt datetime16.;****run;**

**/* Create a time series of the changes in workload. Arivals increase the ****workload, departures decrease it. Also intersperse the hours (with no****workload changes) in the time series. Sum all the changes that occured at ****a given time stamp (e.g. two patients arrive at the same time) so that ****each time stamp is unique in the time series */****proc sql;****create table times as****select **** dt format=datetime13., **** sum(nsg_wk) as nsg_wk****from ****(select arrive_dt as dt, nsg_wk from sasforum.Array_sample_5_25_13**** where arrive_dt is not missing and depart_dt is not missing**** union all corresponding**** select depart_dt as dt, -nsg_wk as nsg_wk from sasforum.Array_sample_5_25_13**** where arrive_dt is not missing and depart_dt is not missing**** union all corresponding**** select dt, 0 as nsg_wk from hours)****group by dt****order by dt;****quit;**

**/* Compute the cumulative sum of workload changes. Calculate durations****between consecutive workload changes */****data workLoads(keep=start duration workload);****set times;****duration = dif(dt);****start = lag(dt);****if duration > 0 then output;****workLoad + nsg_wk;****format start datetime16. duration time5.;****run;**

**/* Calculate the hourly average of workloads, weighted by their duration */****proc sql;****create table hourlyWorkloads as****select **** intnx("DTHOUR", start, 0) as dt format=datetime13.,**** sum(duration*workload)/sum(duration) as mean_workload****from workloads****group by calculated dt;****quit;**

PG

PG

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-26-2013 05:16 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-27-2013 11:32 AM

Thanks very much Tom and yes Pierre's answer was extremely helpful. My thanks to you as well for your willingness to help....what a tremendous resource this discussion board is!

I have two other SAS programming needs as part of my doctoral project, 1/ Linear Programming, and 2/ XML in order to export forecast workload results into scheduling software (Kronos specifically).

I am looking for assistance in both of these areas also so if you or Pierre are able and interested please let me know.

Thanks once again Tom....please enjoy your day.

Kevin

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-27-2013 03:56 PM

You're well on your way!

I deal more with the data management end of SAS, but there are lots of people who can advise about the LP portion of your work.

I'll keep an eye out for your XML issues. I can pretty much guarantee it'll be dead simple in SAS.

Best,

Tom

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-27-2013 11:21 AM

Tres bon Monsieur Gagnon!

Merci beaucoup pour votre aide.

I ran this code on a full set of data (I have patient level data for 4 different EDs each over a 6 year period) and with the exception of higher than expected workload at times everything looks good. The high workload may be accurate, I just need to look at it carefully.

This appears to be a very elegant solution to my challenge....very well done and many thanks.

I am a part-time doctoral student (full-time ED Director) and a new SAS user. My next step is to now take the workload data and run it through an ARIMA forecasting model someone at SAS has been helping me develop. Once workload per hour is forecast I would like to use Linear Programming in SAS to generate the optimum daily staff schedule by defining specific shift start times and durations. Do you have any suggestions on that?

I'm new to the SAS discussion board and don't want to offend you by offering this but I would be willing to compensate you for your time if you are interested.

Anyway thanks very much for your help with this Mr. Gagnon....I sincerely appreciate it.

Kevin

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-27-2013 08:55 PM

Hi Kevin,

I'm happy I could be of some help. And I also appreciate the social usefulness of your project. Taking a step back and considering your mention of ARIMA (which means you have access to SAS/ETS) makes me realize that a couple of steps in my little program could probably be replaced with a call to proc expand. Oh well!

I am flattered by your offer, but I am fully employed and the other parts of your study would take me quite far from my comfort zone.

Wishing you the best with your project,

Au plaisir!

PG

PG

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-27-2013 09:39 PM

Hi PG,

I understand...thank you very much once again for your help.

Cheers,

Kevin