Hi,
I have a 1+ year data, every month we need to run a program to procedure a report. In this report, it inculdes:
(1) mean, low range and high range of previous full year (2012) is base on previous full year (2012) data.
(2) Current month do not have full data, so mean, low range and high range will leave blank.
(3) The mean, low, high range for previous month of current year will based on previous 12 months data.
For example:
If I run this report on March 2013, then the mean, low and high range will base on 03/2012 to 02/2013 data.
I created a program to produce this report; however, there are too many steps in the program. I am wondering if there is a way to simplify this program.
The following data I used March 2013 as current month.
Thanks in advance for your help.
data test;
input year month a $ count;
cards;
2012 1 aa 160
2012 1 bb 260
2012 2 aa 760
2012 2 bb 860
2012 3 aa 260
2012 3 bb 960
2012 4 aa 760
2012 4 bb 660
2012 5 aa 560
2012 5 bb 660
2012 6 aa 460
2012 6 bb 460
2012 7 aa 360
2012 7 bb 260
2012 8 aa 260
2012 8 bb 960
2012 9 aa 160
2012 9 bb 860
2012 10 aa 575
2012 10 bb 775
2012 11 aa 690
2012 11 bb 390
2012 12 aa 760
2012 12 bb 560
2013 1 aa 490
2013 1 bb 290
2013 2 aa 809
2013 2 bb 709
2013 3 aa 609
2013 3 bb 309
;
run;
/***** Get the average and std for last full year and current month *****/
proc means data = test maxdec=2 noprint;
where year = 2012 or year = 2013 and month = 3;
class year a;
var count;
output out=avgqt0 mean=avg_count std=stdev;
run;
/***** Create a rolling 12 month date *****/
data test2;
set test;
do i = 0 to 3;
if year = 2013 and month = 3 then delete;
if year = 2012 and month = i-1 then delete;
end;
run;
/***** Compute mean and std for rolling 12 month data *****/
proc means data = test2 maxdec=2 noprint;
class a;
var count;
output out=avgqt1 mean=avg_count std=stdev;
run;
/***** Get the range for previous full year *****/
/***** Set the mean for current month as Null since it is not a full month data *****/
data avgqt00;
set avgqt0(drop=_freq_);
if _type_ in (3);
if year = 2012 then
do;
two_stdev=round(2*stdev);
avg_count = round(avg_count);
low=avg_count-two_stdev;
high=avg_count+two_stdev;
flag = 1;
end;
else
do;
avg_count = .;
flag = 3;
end;
drop _type_ stdev two_stdev;
run;
/***** Get the range for rolling 12 month data*****/
data avgqt11;
set avgqt1(drop=_freq_);
if _type_ in (1);
two_stdev=round(2*stdev);
avg_count = round(avg_count);
low=avg_count-two_stdev;
high=avg_count+two_stdev;
flag = 2;
drop _type_ stdev two_stdev;
run;
PROC SORT DATA = test out=sort2012;
where year = 2012 or year = 2013 and month = 3;
by year a;
run;
/***** Merge previous full year and current month data back to original same period of time data *****/
data all2012;
length flag2 $5;
merge sort2012 (in = B)
avgqt00 (in = C)
;
by year a;
if B;
if year = 2012 then
do;
IF LOW <= a <= HIGH THEN FLAG2='OK';
ELSE FLAG2 ='CHECK';
end;
ELSE flag2 = ' ';
RUN;
PROC SORT DATA = test out=sort2013;
where year = 2013 and month ^= 3;
by year a;
run;
/***** Merge rolling 12 month data back to original current year data except current month *****/
data all2013;
length flag2 $5;
merge sort2013 (in = B)
avgqt11 (in = C)
;
by a;
if B;
IF LOW <= a <= HIGH THEN FLAG2='OK';
ELSE FLAG2 ='CHECK';
RUN;
/***** Combine these two data set to get the final report *****/
data all;
set all2012
all2013;
run;
proc sort data = all;
by year flag;
run;
proc print data = all;
var year month a count avg_count low high flag2;
run;
This is most of it I think. You need to make you date info a SAS date.
The sort is different and I didn't calculate LOW HIGH or FLAG but you should be able to do that ok.
I added a true date column in your test data:
data test;
input year month a $ count;
format dt date9.;
dt = mdy(month,1,year);
cards;
2012 1 aa 160
2012 1 bb 260
2012 2 aa 760
2012 2 bb 860
2012 3 aa 260
2012 3 bb 960
2012 4 aa 760
2012 4 bb 660
2012 5 aa 560
2012 5 bb 660
2012 6 aa 460
2012 6 bb 460
2012 7 aa 360
2012 7 bb 260
2012 8 aa 260
2012 8 bb 960
2012 9 aa 160
2012 9 bb 860
2012 10 aa 575
2012 10 bb 775
2012 11 aa 690
2012 11 bb 390
2012 12 aa 760
2012 12 bb 560
2013 1 aa 490
2013 1 bb 290
2013 2 aa 809
2013 2 bb 709
2013 3 aa 609
2013 3 bb 309
;
run;
proc sql;
select
case
when t1.dt = t2.curr_dt then 'Current'
when t1.dt > intnx('month',t2.curr_dt,-13) then 'Prior Rolling 12 Months'
end as Period
,count(*) as Obs
,avg(count) as Mean_Count
,min(count) as Min_Count
,max(count) as Max_Count
from
test t1,
(select max(dt) format=date9. as curr_dt from test) t2
where calculated period is not null
group by calculated period
;
quit;
produces this dataset:
Current | 2 | 459 | 309 | 609 |
Prior Rolling 12 Months | 24 |
| 160 | 960 |
Thank you so much. Works well too.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.