SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Creating Maximum of the Sum of Rows for a specified Time Interval

Reply
Contributor
Posts: 58

Creating Maximum of the Sum of Rows for a specified Time Interval

Dear SAS Community,

 
I am trying to create a summary dataset from intraday data to output the Maximum of the Sum of Rows for a specified Time Interval. The dataset I have and the dataset I want are shown below. In short, I am trying to create a new dataset that gives me a summary from the intraday data which show the maximum number of steps within any 5 minute interval. In this specific example, the interval is between 0:16:00.000 to 0:20:00.000 and the maximum is (23+0+21+0+0)=44 steps.  
 
Any help is much appreciated.
 
Best,
 
Pronabesh
 
HAVE
 
date time steps
01JAN2015 0:00:00.000 3
01JAN2015 0:01:00.000 7
01JAN2015 0:02:00.000 9
01JAN2015 0:03:00.000 0
01JAN2015 0:04:00.000 11
01JAN2015 0:05:00.000 0
01JAN2015 0:06:00.000 0
01JAN2015 0:07:00.000 0
01JAN2015 0:08:00.000 0
01JAN2015 0:09:00.000 0
01JAN2015 0:10:00.000 0
01JAN2015 0:11:00.000 0
01JAN2015 0:12:00.000 0
01JAN2015 0:13:00.000 0
01JAN2015 0:14:00.000 0
01JAN2015 0:15:00.000 0
01JAN2015 0:16:00.000 23
01JAN2015 0:17:00.000 0
01JAN2015 0:18:00.000 21
01JAN2015 0:19:00.000 0
01JAN2015 0:20:00.000 0
WANT
date maximum_5 interval 
01JAN2015 44 5
Super User
Posts: 11,343

Re: Creating Maximum of the Sum of Rows for a specified Time Interval

Posted in reply to pronabesh

If your data is ALWAYS at 1 minute intervals something like this provides the total steps you are asking for. However it this may not be a general solution if your data crosses date boundaries.

This assumes your data looks exactly as posted. If you need this by date for a longer period of time there could well be other considerations.

 

data temp;
   set have;
   TotalSteps = sum(lag1(steps),lag2(steps),lag3(steps),lag4(steps),lag5(steps));
run;

proc summary data=temp nway;
   class date;
   var TotalSteps;
   output out=want (drop= _type_ _freq_) max=maximum_5;
run;

If your data is not at minute intervals then this does not work. Also if you are looking at doing multiple intervals then it does not extend well.

 

Super User
Posts: 5,429

Re: Creating Maximum of the Sum of Rows for a specified Time Interval

Posted in reply to pronabesh
Using the word Maximum confuses, it seems like a straight Sum to me.
Another option is to use a Time Dimension, in which you have your intervals defined. Then do a SQL join with filter and group by date and potentially interval id of you wish to report on multiple intervals in the same query.
Data never sleeps
Ask a Question
Discussion stats
  • 2 replies
  • 247 views
  • 0 likes
  • 3 in conversation