I have a set a of observations made once per minute over many months.
My task is to generate a report, showing the peak and the peak 5 minute average, with the day and time of that occurrence for each month. For example, if this was my data,
The peak would be 5 occurring on 12APR15:15:08
The peak 5 minute average would be 3.8 for the period 12APR15:15:06 to 12APR15:15:10
TIMESTAMP MEASUREMENT
12APR15:15:01 2
12APR15:15:02 2
12APR15:15:03 2
12APR15:15:04 3
12APR15:15:05 3
12APR15:15:06 3
12APR15:15:07 4
12APR15:15:08 5
12APR15:15:09 4
12APR15:15:10 3
12APR15:15:11 2
12APR15:15:12 2
12APR15:15:13 2
Is this what you need?
data have;
input TIMESTAMP :datetime20. MEASUREMENT;
format timestamp datetime20.;
cards;
12APR15:15:01 2
12APR15:15:02 2
12APR15:15:03 2
12APR15:15:04 3
12APR15:15:05 3
12APR15:15:06 3
12APR15:15:07 4
12APR15:15:08 5
12APR15:15:09 4
12APR15:15:10 3
12APR15:15:11 2
12APR15:15:12 2
12APR15:15:13 2
;
proc expand data=have out=have1;
convert measurement=ave /transformout=(CMOVAVE 5);
run;
proc sql;
select *, put(datepart(timestamp),monyy7.) as month from have1
group by month
having measurement=max(measurement)
;
quit;
Haikuo
Yes, that looks like what I need. I get the result:
month | peak | TIMESTAMP | ave |
------------------------------------------------------------------------------------------------------------------------------------
20154 | 5 | 12APR2015:15:08:00 | 3.8 |
Is there a way to print the month as "APR", as I have a years worth of data.
This should do:
proc sql;
select *, put(datepart(timestamp),monyy7.) as month from have1
group by month
having measurement=max(measurement)
;
quit;
Haikuo
If you don't have SAS/ETS, here is a data step way of doing it:
data have;
input TIMESTAMP :datetime20. MEASUREMENT;
format timestamp datetime20.;
cards;
12APR15:15:01 2
12APR15:15:02 2
12APR15:15:03 2
12APR15:15:04 3
12APR15:15:05 3
12APR15:15:06 3
12APR15:15:07 4
12APR15:15:08 5
12APR15:15:09 4
12APR15:15:10 3
12APR15:15:11 2
12APR15:15:12 2
12APR15:15:13 2
;
data have1;
set have;
month=put(datepart(timestamp),monyy7.);
run;
proc sort data=have1 out=have;
by month timestamp;
run;
data want (drop=_: measurement);
do _n_=1 by 1 until (last.month);
set have;
by month;
peak=max(peak,measurement);
end;
do _n_=1 by 1 until (last.month);
set have;
by month;
if measurement=peak then _n=_n_;
end;
do _n_=1 by 1 until (last.month);
set have;
by month;
if _n_=_n-2 then _sum=0;
_sum+measurement;
if _n_=_n+2 then do;
ave=_sum/5;
output;
end;
end;
run;
proc print;run;
Regards,
Haikuo
Another method. This will take the next 5 minutes for each minute (until no more spans of 5 timestamps are available) and calculate the average, track the max, etc... This is slightly different than what I think your approach was stated to be so be aware of the difference.
data foo;
format timestamp datetime.;
input timestamp datetime14. measurement;
cards;
12APR15:15:01 2
12APR15:15:02 2
12APR15:15:03 2
12APR15:15:04 3
12APR15:15:05 3
12APR15:15:06 3
12APR15:15:07 4
12APR15:15:08 5
12APR15:15:09 4
12APR15:15:10 3
12APR15:15:11 2
12APR15:15:12 2
12APR15:15:13 2
;
run;
data bar;
length span _span $54;
retain _span _timestamp max_measurement max_span;
tot=0;
do i=1 to 5 while( (n+i)<nobs );
p=n+i;
set foo point=p nobs=nobs;
_span=catx(' ',_span,timestamp);
tot+measurement;
if measurement>max_measurement then do;
max_measurement=measurement;
_timestamp=timestamp;
end;
end;
avg=tot/5;
if avg>max_span then do;
max_span=avg;
span=_span;
end;
n+1;
if n+5>nobs then do;
_span1=input(scan(_span,1),10.);
_span5=input(scan(_span,5),10.);
put 'PEAK at ' _timestamp datetime. ' with value of' max_measurement 2. /
'PEAK SPAN between ' _span1 datetime. ' and ' _span5 datetime. ' with value of' max_span 4.1 ;
output;
keep _span1 _span5 max_span _timestamp max_measurement;
stop;
end;
run;
PEAK at 12APR15:15:08:00 with value of 5
PEAK SPAN between 12APR15:15:05:00 and 12APR15:15:09:00 with value of 3.8
As you can see the difference is that my interpretation of the problem with take the PEAK SPAN as being different than what you say, starting one minute earlier but with the same value.
These are all really great answers. I'm pretty new to SAS and already am able to do great things (with your help of course).
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.