## Running sum across observations for a specific time period

Occasional Contributor
Posts: 8

# Running sum across observations for a specific time period

Hi,

I have a longitudinal dataset and would like to sum up a field (drug dose) for 2 and 4 hour intervals.  Essentially I want SAS to create logic to look back over the past 2 and 4 hour period and calculate a total.  I was able to set up the following code to add up doses for 2 and 4 hour intervals, however, this only calculates a total over the 2 or 4 hour period and then resets it (not a running sum.

Thanks,  JH

data temp2;

set temp;

by episode_id datetime;

retain sum_dose_2hr sum_dose_4hr begin_dose_dt_2hr begin_dose_dt_4hr;

format begin_dose_dt_2hr datetime20.

begin_dose_dt_4hr datetime20.;

if first.episode_id then do;

begin_dose_dt_2hr=datetime; begin_dose_dt_4hr=datetime;

sum_dose_2hr=0; sum_dose_4hr=0;

end;

if ((datetime-begin_dose_dt_2hr)/3600)>2 then do;

begin_dose_dt_2hr=datetime; sum_dose_2hr=0;

end;

if ((datetime-begin_dose_dt_4hr)/3600)>4 then do;

begin_dose_dt_4hr=datetime; sum_dose_4hr=0;

end;

sum_dose_2hr=sum(sum_dose_2hr,IV_SOLUTIONS_drug_dose);

sum_dose_4hr=sum(sum_dose_4hr,IV_SOLUTIONS_drug_dose);

run;

Super Contributor
Posts: 543

## Re: Running sum across observations for a specific time period

Could you provide a small sample (fake) data?

Anca.

Occasional Contributor
Posts: 8

## Re: Running sum across observations for a specific time period

Sure.  My data is set similar to the following

episode_id          datetime                 drug_dose

1              01jan2013:12:00               10

1              01jan2013:01:00               10

1              01jan2013:02:00               10

1              01jan2013:02:30               10

1              01jan2013:04:30               10

Here is how I want my data to look

episode_id          datetime                 drug_dose          sum_drug_dose_2hr          sum_drug_dose_4hr

1              01jan2013:12:00               10                            10                                     10

1              01jan2013:01:00               10                            20                                     20

1              01jan2013:02:00               10                            30                                     30

1              01jan2013:02:30               10                            30                                     40

1              01jan2013:04:30               10                            20                                     40

Super Contributor
Posts: 543

## Re: Running sum across observations for a specific time period

Oi, I solved this in a non-elegant way.

I assume you need another flag to get the logic that you desire:

(The bold part in the code are my additions)

data temp2;

set temp;

by episode_id datetime;

retain sum_dose_2hr sum_dose_4hr begin_dose_dt_2hr begin_dose_dt_4hr;

format begin_dose_dt_2hr datetime20.

begin_dose_dt_4hr datetime20.;

if first.episode_id then do;

begin_dose_dt_2hr=datetime; begin_dose_dt_4hr=datetime;

sum_dose_2hr=0; sum_dose_4hr=0;

end;

if ((datetime-begin_dose_dt_2hr)/3600)>2 then do;

temp_2hr = sum_dose_2hr;begin_dose_dt_2hr=datetime; sum_dose_2hr=0;

end;

if ((datetime-begin_dose_dt_4hr)/3600)>4 then do;

temp_4hr = sum_dose_4hr;begin_dose_dt_4hr=datetime; sum_dose_4hr=0;

end;

sum_dose_2hr=sum(sum_dose_2hr,IV_SOLUTIONS_drug_dose);

sum_dose_4hr=sum(sum_dose_4hr,IV_SOLUTIONS_drug_dose);

run;

data step;

set temp2;

if temp_2hr ne . then sum_dose_2hr = temp_2hr;else sum_dose_2hr = sum_dose_2hr;

if temp_4hr ne . then sum_dose_4hr = temp_4hr;else sum_dose_4hr = sum_dose_4hr;

drop temp_:;

run;

proc print;run;

Good luck!

Nice problem.

Anca.

PS: there are more elegant/efficient ways to solve this, maybe someone will provide a better solution

Occasional Contributor
Posts: 8

## Re: Running sum across observations for a specific time period

Unfortunately your code does not seem to work.  The temp variables overwrite the sum_dose fields whenever the 2 hour or 4 hour times are exceeded.

I was able to have a colleague help me solve this problem using proc expand to create dummy rows at the minute level; then calculate running sums.

JH

Super Contributor
Posts: 543

## Re: Running sum across observations for a specific time period

I apologize if I didn't quite get it.

I thought that you wanted the sum to be carried over

 1   01JAN13:00:00 1 10 10 10 . . 10 10 2   01JAN13:01:00 1 10 20 20 . . 20 20 3   01JAN13:02:00 1 10 30 30 . . 30 30 4   01JAN13:02:30 1 10 10 40 1 . 30------ --------->>>this is more than 2 hrs, carry down the sum from above, while you 'reset' the sum and 5   01JAN13:04:30 1 10 20 10 . 1 20------ ---------->>> so this sum now it's 20 (10 + 10) 6   01JAN13:00:00 2 10 10 10 . . 10 10 7   01JAN13:01:00 2 10 20 20 . . 20 20 8   01JAN13:02:00 2 10 30 30 . . 30 30 9   01JAN13:02:30 2 10 10 40 1 . 30 40 10   01JAN13:04:30 2 10 20 10 . 1 20 40 ------>>> similarly, this is more than 4 hrs...so carry down the sum....

I do have a proc sort by id and datetime;

Given this dataset, what does your code result in?

data temp;

format datetime datetime15.;

input episode_id          datetime datetime15.                 iv_solutions_drug_dose;

datalines;

1 01jan2013:00:00 10

1 01jan2013:01:00 10

1 01jan2013:02:00 10

1 01jan2013:02:30 10

1 01jan2013:04:30 10

2 01jan2013:00:00 10

2 01jan2013:01:00 10

2 01jan2013:02:00 10

2 01jan2013:02:30 10

2 01jan2013:04:30 10

;

I am curious, that's all.

Thanks.

Anca.

Occasional Contributor
Posts: 8

## Re: Running sum across observations for a specific time period

Sorry if my post was not clear.  I was trying to calculate a running sum of the dose for the previous 2 and 4 hours, not carry down the sum.

I can post the results of my code later, but I need to test it out first.

Thanks,

JH

Posts: 5,535

## Re: Running sum across observations for a specific time period

Are those datetimes from the second one on supposed to be PM?

PG
Occasional Contributor
Posts: 8

## Re: Running sum across observations for a specific time period

All of the datetimes are military times (up to 24 hrs) in SAS, no AM or PM is denoted.

Posts: 5,535

## Re: Running sum across observations for a specific time period

The datetimes in your example are not sorted then. Why sum_drug_dose_2hr=20 at 01jan2013:01:00? There is no datetime preceding it in your data. Anca's code assumes your data is sorted.

PG
Occasional Contributor
Posts: 8

## Re: Running sum across observations for a specific time period

You are correct.  I made up the data for the example.  The 12:00 time should have read 00:00.

Super Contributor
Posts: 543

## Re: Running sum across observations for a specific time period

And I omitted to mention that I  changed the dates to be 13,14,14.30,and 16.30.

Sorry.

:-)

Posts: 5,535

## Re: Running sum across observations for a specific time period

You may find SQL to be a slightly simpler solution:

proc sql;

create table want as

select a.*, sum_drug_dose_4h from

(select

h0.*,

sum(h2.drug_dose) as sum_drug_dose_2h

from

temp as h0 inner join

temp as h2 on h0.episode_id=h2.episode_id and

h2.datetime between intnx("HOUR", h0.datetime, -2, "SAME") and h0.datetime

group by h0.episode_id, h0.datetime, h0.drug_dose) as a

natural join

(select

h0.episode_id,

h0.datetime,

sum(h4.drug_dose) as sum_drug_dose_4h

from

temp as h0 inner join

temp as h4 on h0.episode_id=h4.episode_id and

h4.datetime between intnx("HOUR", h0.datetime, -4, "SAME") and h0.datetime

group by h0.episode_id, h0.datetime);

select * from want;

quit;

I did say slightly?

PG

PG
Super User
Posts: 6,781

## Re: Running sum across observations for a specific time period

It looks like you have a relatively limited number of records to sum.  If you were to store the most recent 10 doses, for example, that might be sufficient to calculate a total for the most recent 4 hours.  In that case, a set of two arrays (one with doses, one with datetimes)  would probably be the simplest solution.

If you can confirm that 10 doses would always be enough for a 4-hour time period (or if not, post an alternative maximum number of doses needed), I would be able to write the code later tonight.

Good luck.

Occasional Contributor
Posts: 8

## Re: Running sum across observations for a specific time period

I appreciate all the help everyone!

JH

Discussion stats
• 16 replies
• 561 views
• 0 likes
• 4 in conversation