Rolling cumulative sum in time series data

Reply
Occasional Contributor
Posts: 15

Rolling cumulative sum in time series data

Hello everyone,

Given a data set:

ID      Time          Flag

1      8:45:01          0

2      8:45:04          1

3      8:45:06          1

4      8:45:08          0

5      8:45:09          1

6      8:45:11          0

7      8:45:13          0

...

For each record, I would like to count how many flag=1 in previous 5 seconds.

For example:

For id3, previous 5 seconds is 8:45:01 - 8:45:05, including id1 and id2. Number of flag = 1 is 1 (occured at id2 8:45:04).

For id6, previous 5 seconds is 8:45:06 - 8:45:10, including id3, id4, and id5. Number of flag = 1 is  2 (occured at id3 and id5).

In other words, the final data set should look like this:

ID      Time          Flag    Number

1      8:45:01          0           0

2      8:45:04          1           0

3      8:45:06          1           1

4      8:45:08          0           2

5      8:45:09          1           2

6      8:45:11          0           2

7      8:45:13          0           1

...

Number of included records is not always the same in every reocrd's 5 seconds interval. How can I create the final table? Thank you for your assistance.

Best regards,

Tammy

Valued Guide
Posts: 2,174

Re: Rolling cumulative sum in time series data

Proc expand might fill the range

Occasional Contributor
Posts: 15

Re: Rolling cumulative sum in time series data

There are missing values in the between of records based on second time interval. I like to keep missing values as missing or zero.

However, I can only find METHOD=STEP to interploate missing values.

Is there any other option to deal with missing values?

Help please!!!!

Respected Advisor
Posts: 3,124

Re: Rolling cumulative sum in time series data

I am sure with proc expand you are able to do this, but why not a simple proc sql?

data have;

input ID :$      Time :time8.          Flag;

FORMAT TIME TIME8.;

cards;

1      8:45:01          0

2      8:45:04          1

3      8:45:06          1

4      8:45:08          0

5      8:45:09          1

6      8:45:11          0

7      8:45:13          0

;

/*PROC SQL*/

PROC SQL;

  CREATE TABLE WANT AS

    SELECT *, (SELECT COALESCE(SUM(FLAG=1),0) FROM HAVE WHERE TIME BETWEEN A.TIME-5 AND A.TIME-1) AS NUMBER

        FROM HAVE A;

QUIT;

Haikuo

Occasional Contributor
Posts: 15

Re: Rolling cumulative sum in time series data

Thank you, Hai.kuo. This is exactly what I am looking for. Simple and pretty. Smiley Happy

Occasional Contributor
Posts: 15

Re: Rolling cumulative sum in time series data

To extend the question, what if there are multiple IDs in the data and treat each ID independently? Thanks!!

For example:

ID      Time          Flag

A      8:45:01          0

A      8:45:04          1

A      8:45:06          1

A      8:45:08          0

A      8:45:09          1

B      8:45:11          0

B      8:45:13          0

B      8:45:01          0

C      8:45:04          1

C      8:45:06          1

C      8:45:08          0

C      8:45:09          1

C      8:45:11          0

....

Best,

Tammy

Ask a Question
Discussion stats
  • 5 replies
  • 491 views
  • 0 likes
  • 3 in conversation