I have a data set that I would like to produce means, sum etc for a fixed interval of 15 minutes starting at at not so even time e.g. 10:02:53. So the file should be a sum or mean of the data between 10:02:53 and 10:17:53 and then again from 10:17:54 to 10:32:54 etc. I have tried with the round function but that does not work.
I need to merge two data sets and one has one data point per second and the other have a summary every 15 minutes but not starting on times that are easy to handle.
Anyone have a solution for this?
Looks to me like you will need something like this:
/* Create a datetime variable */
DT = dhms(date, hour(time), minute(time), second(time));
format DT datetime17.;
/* Calculate time interval */
timeInterval = floor((DT - '27FEB2015:10:05:48'dt) / '00:15:00't);
And then summarize your data by timeInterval.
PG
Hi,
Start by assigning each row to a group variable. You can then group by the group variable. E.g.
data have;
VALUE=123; TIME="10:02:53"t; output;
VALUE=123; TIME="10:10:53"t; output;
VALUE=123; TIME="10:17:53"t; output;
VALUE=123; TIME="10:20:53"t; output;
run;
data want;
set have;
retain first_time current_group;
if _n_=1 then do;
first_time=time;
current_group=1;
end;
else if time > first_time + "00:17:00"t then do;
first_time=time;
current_group=current_group+1;
end;
format first_time time time7.;
run;
Since your problem involves a data merge you should provide a few lines of example data from each set and what you expect the resulting merge to look like.
If the issue involves getting items into actual time order then I would check to make sure you are using DATETIME variables if the data is longer than a given day. Then merging or set and sort would put things in the correct order.
The data looks like this. I need to be able to summarize the data 'mot', 'stan', 'lie' and 'step' within 15 min interval starting at 10:05:48, so when summarizing the data I need to be able to specify when the first 15 min period should start.
Date Mot stan Lie Step Time Cow
27/02/2015 0 100 0 0 10:01:18 4009
27/02/2015 0 100 0 0 10:01:19 4009
27/02/2015 0 100 0 0 10:01:20 4009
27/02/2015 0 100 0 0 10:01:21 4009
27/02/2015 0 100 0 0 10:01:22 4009
27/02/2015 0 100 0 0 10:01:23 4009
27/02/2015 0 100 0 0 10:01:24 4009
27/02/2015 0 100 0 0 10:01:25 4009
27/02/2015 0 100 0 0 10:01:26 4009
27/02/2015 0 100 0 0 10:01:27 4009
27/02/2015 0 100 0 0 10:01:28 4009
27/02/2015 0 100 0 0 10:01:29 4009
27/02/2015 0 100 0 0 10:01:30 4009
27/02/2015 0 100 0 0 10:01:31 4009
I want something like this:
datetime | cow | _TYPE_ | _FREQ_ | mmotion | mstanding | mlying | msteps |
27FEB15:11:16:23 | 4009 | 0 | 900 | 0 | 0 | 100 | 0 |
27FEB15:11:31:23 | 4009 | 0 | 900 | 0 | 0 | 100 | 0 |
Does this make sense?
Looks to me like you will need something like this:
/* Create a datetime variable */
DT = dhms(date, hour(time), minute(time), second(time));
format DT datetime17.;
/* Calculate time interval */
timeInterval = floor((DT - '27FEB2015:10:05:48'dt) / '00:15:00't);
And then summarize your data by timeInterval.
PG
This works perfectly, Thanks for your help and thanks to all the participants in this discussion. Your input has been really valuable for me.
You can use sub-query of SQL to get it as well.
select *,(select sum(mot) from have where a.Time <= Time <= a.Time+15*60) as mot
from have as a
You could do something with a format to assign the intervals based on datetime ranges, such as the following:
data intvls;
format datetime datetime.;
do datetime = '27FEB15:11:16:23'dt to '28FEB15:11:16:23'dt by 900;
start = datetime;
end = intnx('second', start, 899);
output;
end;
run;
data formats;
set intvls;
label = _n_;
fmtname = 'cstmInt';
type = 'I';
run;
proc format cntlin = formats;
run;
data have;
do datetime = '27FEB15:11:16:23'dt to '28FEB15:11:16:23'dt;
output;
end;
run;
data want;
format datetime datetime.;
set have;
interval = input(datetime, cstmInt.);
nmotion = rannor(5);
run;
proc sql;
SELECT interval 'Interval',
min(datetime) AS intStart format datetime. 'First Time',
max(datetime) AS intEnd format datetime. 'Last Time',
count(*) 'Count',
mean(nmotion) 'Mean nmotion'
FROM want
GROUP BY interval;
quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.