Rolling cumulative sum in time series data

Reply
Occasional Contributor
Posts: 15

Rolling cumulative sum in time series data

Hello everyone,

Given a data set:

ID      Time          Flag

1      8:45:01          0

2      8:45:04          1

3      8:45:06          1

4      8:45:08          0

5      8:45:09          1

6      8:45:11          0

7      8:45:13          0

...

For each record, I would like to count how many flag=1 in previous 5 seconds.

For example:

For id3, previous 5 seconds is 8:45:01 - 8:45:05, including id1 and id2. Number of flag = 1 is 1 (occured at id2 8:45:04).

For id6, previous 5 seconds is 8:45:06 - 8:45:10, including id3, id4, and id5. Number of flag = 1 is  2 (occured at id3 and id5).

In other words, the final data set should look like this:

ID      Time          Flag    Number

1      8:45:01          0           0

2      8:45:04          1           0

3      8:45:06          1           1

4      8:45:08          0           2

5      8:45:09          1           2

6      8:45:11          0           2

7      8:45:13          0           1

...

Number of included records is not always the same in every reocrd's 5 seconds interval. How can I create the final table? Thank you for your assistance.

Best regards,

Tammy

Valued Guide
Posts: 2,177

Re: Rolling cumulative sum in time series data

Posted in reply to TammyGoSAS

Proc expand might fill the range

Occasional Contributor
Posts: 15

Re: Rolling cumulative sum in time series data

There are missing values in the between of records based on second time interval. I like to keep missing values as missing or zero.

However, I can only find METHOD=STEP to interploate missing values.

Is there any other option to deal with missing values?

Help please!!!!

Respected Advisor
Posts: 3,156

Re: Rolling cumulative sum in time series data

Posted in reply to TammyGoSAS

I am sure with proc expand you are able to do this, but why not a simple proc sql?

data have;

input ID :$      Time :time8.          Flag;

FORMAT TIME TIME8.;

cards;

1      8:45:01          0

2      8:45:04          1

3      8:45:06          1

4      8:45:08          0

5      8:45:09          1

6      8:45:11          0

7      8:45:13          0

;

/*PROC SQL*/

PROC SQL;

  CREATE TABLE WANT AS

    SELECT *, (SELECT COALESCE(SUM(FLAG=1),0) FROM HAVE WHERE TIME BETWEEN A.TIME-5 AND A.TIME-1) AS NUMBER

        FROM HAVE A;

QUIT;

Haikuo

Occasional Contributor
Posts: 15

Re: Rolling cumulative sum in time series data

Thank you, Hai.kuo. This is exactly what I am looking for. Simple and pretty. Smiley Happy

Occasional Contributor
Posts: 15

Re: Rolling cumulative sum in time series data

Posted in reply to TammyGoSAS

To extend the question, what if there are multiple IDs in the data and treat each ID independently? Thanks!!

For example:

ID      Time          Flag

A      8:45:01          0

A      8:45:04          1

A      8:45:06          1

A      8:45:08          0

A      8:45:09          1

B      8:45:11          0

B      8:45:13          0

B      8:45:01          0

C      8:45:04          1

C      8:45:06          1

C      8:45:08          0

C      8:45:09          1

C      8:45:11          0

....

Best,

Tammy

Ask a Question
Discussion stats
  • 5 replies
  • 510 views
  • 0 likes
  • 3 in conversation