Help using Base SAS procedures

calculating averages on floating period

Accepted Solution Solved
Reply
Regular Contributor
Posts: 229
Accepted Solution

calculating averages on floating period

Hello,

My dataset looks likes this:

DATE           MONTH      VALUE

01032011      MAR2011    8.50

02032011      MAR2011    8.50

02032011      MAR2011    5.50

02032011      MAR2011    8.50

14032011      MAR2011    7.50

24032011      MAR2011    9.00

24042011      APR2011     8.50

24042011      APR2011     8.50

26042011      APR2011     2.50

24042011      APR2011     3.50

16052011      MAY2011     2.50

24052011      MAY2011     13.50

25052011      MAY2011     4.50

26052011      MAY2011     3.50

27052011      MAY2011     2.50

25062011      JUN2011     10.50

26062011      JUN2011     13.50

27062011      JUN2011     12.50

28062011      JUN2011     13.50

According to 2 macro variables i have to calculate floating averages/median/percentile.

%let calc = average;

%let bin = 3;

The number of bin is in months. So for this example if i want to calculate the number for JUN2011 the number has to be calculated on the values of APR , MAY and JUN. If the variable bin is 2 i have to calculate JUN on JUN and MAY , MAY on APR and MAY and so on..

Could anyone set me on my way? I have no idea how to start on this one.

Regards,

Filip


Accepted Solutions
Solution
‎03-19-2012 09:41 AM
Super User
Posts: 9,682

Re: calculating averages on floating period

Actually, I have another way. If you don't mind.

data x(drop=month);
input DATE  : ddmmyy12.     MONTH  : monyy5.    VALUE ;
format date date9. month monyy5.;
cards;
01032011      MAR2011    8.50
02032011      MAR2011    8.50
02032011      MAR2011    5.50
02032011      MAR2011    8.50
14032011      MAR2011    7.50
24032011      MAR2011    9.00
24042011      APR2011     8.50
24042011      APR2011     8.50
26042011      APR2011     2.50
24042011      APR2011     3.50
16052011      MAY2011     2.50
24052011      MAY2011     13.50
25052011      MAY2011     4.50
26052011      MAY2011     3.50
27052011      MAY2011     2.50
25062011      JUN2011     10.50
26062011      JUN2011     13.50
27062011      JUN2011     12.50
28062011      JUN2011     13.50
;
run;
data x;
 set x;
 monyy=mdy(month(date),1,year(date));
 format monyy monyy5.;
run;




%macro cal(monyy=);
data xx;
 set x(where=(monyy between intnx('month',&monyy,-&bin+1,'b') and &monyy));
 month=put(&monyy,monyy5.);
run;
proc summary data=xx nway;
 class month;
 var value;
 output out=result(drop=_:)  mean= median= p75= /autoname;
run;
%mend cal;


%let bin = 3;

proc sort data=x(keep=monyy) out=temp nodupkey;by monyy;run;
data _null_;
 set temp;
 call execute('%cal(monyy='||strip(monyy)||')');
 if _n_ eq 1 then call execute('data final;set result;stop;run;');
 call execute('proc append base=final data=result force;run;');
run;

 


Ksharp

View solution in original post


All Replies
Super User
Posts: 9,682

calculating averages on floating period

What output do you want?

N/A
Posts: 1

calculating averages on floating period

Do you want average for all the previous months (as value in bin macro variable) for every row in a datamart/datalines?

Regular Contributor
Posts: 229

calculating averages on floating period

Output would be month + average (could be median or percentile as well)

Also it could be a dataset with more then this 4 months but 4 years for example.

so in this case (numers are not correct).

Month        Value

MAR2011   5                will have an average only of March because no data available for JAN and FEB

APR2011   10                will have an average of March and Apr together

MAY2011   11                will have an average of March and Apr and May together

JUN2011    12                will have an average of Apr and May and Juntogether

Super User
Posts: 9,682

calculating averages on floating period

OK. Try this.

data x;
input DATE  : ddmmyy12.     MONTH  : monyy5.    VALUE ;
format date date9. month monyy5.;
cards;
01032011      MAR2011    8.50
02032011      MAR2011    8.50
02032011      MAR2011    5.50
02032011      MAR2011    8.50
14032011      MAR2011    7.50
24032011      MAR2011    9.00
24042011      APR2011     8.50
24042011      APR2011     8.50
26042011      APR2011     2.50
24042011      APR2011     3.50
16052011      MAY2011     2.50
24052011      MAY2011     13.50
25052011      MAY2011     4.50
26052011      MAY2011     3.50
27052011      MAY2011     2.50
25062011      JUN2011     10.50
26062011      JUN2011     13.50
27062011      JUN2011     12.50
28062011      JUN2011     13.50
;
run;
%let bin=3;
proc sql ;
create table want(drop=n) as
 select month,(select avg(value)  from x where month between intnx('month',a.month,-&bin,'s') and a.month) as mean,count(*) as n
  from x as a
   group by month;
quit;

Ksharp

Regular Contributor
Posts: 229

Re: calculating averages on floating period

Ksharp, thanks a lot. This will help me a lot already.

A question that pops into my mind: how to do percentile 70 for example in sql? Median is just an other function but does their exist a percentile function as well? 

Regular Contributor
Posts: 229

Re: calculating averages on floating period

Unfortunate.. something is going wrong when I test it on my data:

This is the table want (with N kept). I get for APR2011 106 rows=n? With a mean which is always different.. any idea?

APR2011 12.382727273 106

APR2011 12.300052083 106

APR2011 12.17015873 106

APR2011 12.363972603 106

APR2011 12.268622754 106

APR2011 11.871 106

APR2011 12.1575 106

APR2011 12.1575 106

APR2011 12.241363636 106

APR2011 11.871 106

APR2011 12.426490385 106

APR2011 11.968776978 106

APR2011 12.219783784 106

APR2011 12.27355 106

APR2011 12.23695122 106

APR2011 11.886891892 106

EDIT: I think the problem is that, in comparison with your datalines the real numberic value of month is different here..

I tried to replace it with month_char but it does not work

Super User
Posts: 9,682

Re: calculating averages on floating period

It is obvious that your real data is different from me.

Are your data(APR2011) the real date value ?

I use monyy. input APR2011 that means 01APR2011, so in my example all the APR2011 has the same value 01APR2011.

and I guess you should need to create a new variable like mine.

month=mdy(month(date),1,year(date));

Ksharp

Super User
Posts: 9,682

Re: calculating averages on floating period

Opps. for your situation.

You should change %let bin=3 into %let bin=2 , because you only need two month back.

Super User
Posts: 9,682

Re: calculating averages on floating period

Sorry. You can try this.

data x;
input DATE  : ddmmyy12.     MONTH  : monyy5.    VALUE ;
format date date9. month monyy5.;
cards;
01032011      MAR2011    8.50
02032011      MAR2011    8.50
02032011      MAR2011    5.50
02032011      MAR2011    8.50
14032011      MAR2011    7.50
24032011      MAR2011    9.00
24042011      APR2011     8.50
24042011      APR2011     8.50
26042011      APR2011     2.50
24042011      APR2011     3.50
16052011      MAY2011     2.50
24052011      MAY2011     13.50
25052011      MAY2011     4.50
26052011      MAY2011     3.50
27052011      MAY2011     2.50
25062011      JUN2011     10.50
26062011      JUN2011     13.50
27062011      JUN2011     12.50
28062011      JUN2011     13.50
;
run;
%let bin=2;
proc sql ;
create table want as
 select distinct month,(select mean(value) as mean  from x where month between intnx('month',a.month,-&bin,'b') and a.month) as mean
  from x as a ;
quit;

Ksharp

Regular Contributor
Posts: 151

Re: calculating averages on floating period

I had the idea of creating multilabel formats from the dates in the data, this will enable you to use certain SAS procedures (SUMMARY, MEANS, TABULATE, REPORT) to calculate the measures of interest.  This means you have access to the percentiles 1,5,10,25,50,75,90,95,99.  You would normally use PROC UNIVARIATE to calculate other percentiles, but unfortunately this procedure doesn't support multilabel formats.

Here is my code, which uses the intitial dataset created by KSharp (although I don't need the MONTH column).

%let bin=3;

proc sql noprint;

select min(date), intck('month',min(date),max(date)) into :min_dt, :num_mths

from x;

quit;

 

data mth_fmt;

format start end monyy5.;

retain fmtname 'roll_mth' type 'N' hlo 'M';

do i=0 to &num_mths.;

start=intnx('month',&min_dt.,i-&bin.+1);

end=intnx('month',start,&bin.-1,'e');

label=vvalue(end);

output;

end;

drop i;

run;

 

proc format cntlin=mth_fmt;

run;

 

proc summary data=x nway;

class date / mlf order=data;

format date roll_mth.;

var value;

output out=want (drop=_Smiley Happy mean= median= p75= n= / autoname;

run;

Super User
Posts: 9,682

Re: calculating averages on floating period

Keith,

Congratulations! That is a wonderful solution. Educated.

Ksharp

Regular Contributor
Posts: 229

Re: calculating averages on floating period

Indeed Keith, your solution works perfectly. Thanks both for your help.

Solution
‎03-19-2012 09:41 AM
Super User
Posts: 9,682

Re: calculating averages on floating period

Actually, I have another way. If you don't mind.

data x(drop=month);
input DATE  : ddmmyy12.     MONTH  : monyy5.    VALUE ;
format date date9. month monyy5.;
cards;
01032011      MAR2011    8.50
02032011      MAR2011    8.50
02032011      MAR2011    5.50
02032011      MAR2011    8.50
14032011      MAR2011    7.50
24032011      MAR2011    9.00
24042011      APR2011     8.50
24042011      APR2011     8.50
26042011      APR2011     2.50
24042011      APR2011     3.50
16052011      MAY2011     2.50
24052011      MAY2011     13.50
25052011      MAY2011     4.50
26052011      MAY2011     3.50
27052011      MAY2011     2.50
25062011      JUN2011     10.50
26062011      JUN2011     13.50
27062011      JUN2011     12.50
28062011      JUN2011     13.50
;
run;
data x;
 set x;
 monyy=mdy(month(date),1,year(date));
 format monyy monyy5.;
run;




%macro cal(monyy=);
data xx;
 set x(where=(monyy between intnx('month',&monyy,-&bin+1,'b') and &monyy));
 month=put(&monyy,monyy5.);
run;
proc summary data=xx nway;
 class month;
 var value;
 output out=result(drop=_:)  mean= median= p75= /autoname;
run;
%mend cal;


%let bin = 3;

proc sort data=x(keep=monyy) out=temp nodupkey;by monyy;run;
data _null_;
 set temp;
 call execute('%cal(monyy='||strip(monyy)||')');
 if _n_ eq 1 then call execute('data final;set result;stop;run;');
 call execute('proc append base=final data=result force;run;');
run;

 


Ksharp

Regular Contributor
Posts: 229

Re: calculating averages on floating period

thanks for both answers, it helped me a lot in my coding

though i experience an other difficulty at the moment:

%let default_size_definition=70;


proc summary data=finished_requests nway;
class date / mlf order=data;
format date roll_mth_calc.;
var quota;
output out=TQS (drop=_Smiley Happy p&default_size_definition.=TQS;
run;

this will not work as proc summary uses only 75 and not 70

so i found i can use proc univariate but the format does not work here? any help?

proc univariate data = finished_requests noprint;
class date / mlf order=data;
format date roll_mth_calc.;
var quota;
output out = caprate
pctlpts = 10 to 100 by 10
pctlpre = pct_;
run;

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 18 replies
  • 825 views
  • 8 likes
  • 4 in conversation