DATA Step, Macro, Functions and more

Generating rolling 12 month totals for multiple industry codes

Reply
Contributor
Posts: 52

Generating rolling 12 month totals for multiple industry codes

Hello,

 

I am trying to generate 12 month rolling totals for bankruptcy counts for multiple industry codes.  This is what my data looks like for one of those industry codes:

 

Raw_Data.PNG

 

I ran the following code which gives me a running 12-month total for the selected code:

 

************************************
%let roll_num = 12;
data new;
set newview;
array summed[&roll_num] _temporary_;

if E = &roll_num then E = 1;
else E + 1;
summed[E] = Bankruptcy_CA;
if _N_ >= &roll_num then do;
CA_CT = sum(of summed[*]); * CT means cumulative total;
end;
format CA_CT comma8.;
run;

**********************

Which gives me the desired rolling totals:

 

Raw1.PNG

 

The new series is called 'CA_CT'.

 

My question: How do I code this so I can add more than 1 code (I have 600+) so I can generate the kind of rolling totals shown above?  If I add more industry codes to the existing SAS program the rolling totals don't generate properly. 

 

How do I modify the code above?  

 

Thanks

Jack

Super User
Posts: 17,868

Re: Generating rolling 12 month totals for multiple industry codes

Check if you have proc expand, if so look at convert statement. 

Super User
Posts: 6,955

Re: Generating rolling 12 month totals for multiple industry codes

Sort by industry codes and months. Use "by code;" in the data step.

Instead of using _n_, create your own counter (reatain!) that is reset whenever the industry code changes (first.code); you might also clear your array at that point.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 52

Re: Generating rolling 12 month totals for multiple industry codes

Thanks.  Makes sense.  In your statement:

 

Instead of using _n_, create your own counter (reatain!) that is reset whenever the industry code changes (first.code); you might also clear your array at that point.

 

How would I modify my code above to get the desired outcome?  I am fairly new to SAS coding.  Thanks again....

Super User
Posts: 17,868

Re: Generating rolling 12 month totals for multiple industry codes

http://support.sas.com/kb/41/380.html

 

/* create monthly sales data from January 2008 to December 2010 */
data test;
do mo_period = '01jan2008'd to '31dec2010'd;
  sales = round(ranuni(1234567) * 1000, .01);
  mo_period = intnx('month', mo_period, 0, 'END');
  output;
end;
format mo_period monyy7. sales comma10.2;
run;

/*specify the number of periods in the rolling sum and average*/
%let roll_num = 12;

data new;
set test;
/*create array with specific number of elements*/
/*passing _temporary_ arrays as arguments to functions is not supported*/
/*until SAS 9.2*/
/*if the array is a _TEMPORARY_ array, the elements are automatically retained*/
array summed[&roll_num] _temporary_;

/*alternatively, a non-temporary array can be used but must be retained:*/
/*array summed[&roll_num];*/
/*retain summed;*/

/*E represents the element of the array to assign a sales value*/
/*Increment it by one unless it is equal to &roll_num, at which point*/
/*start over and assign it a value of 1. This causes the oldest period to*/
/*be replaced by the newest period once &roll_num periods have been read.*/
if E = &roll_num then E = 1;
else E + 1;
/*assign value to proper element of the array*/
summed[E] = sales;
/*start summing once &roll_num values have been read from the data set*/
if _N_ >= &roll_num then do;
  roll_sum = sum(of summed[*]);
  roll_avg = mean(of summed[*]);
end;
format roll_sum roll_avg comma10.2;
run;

proc print; run;
Super User
Posts: 17,868

Re: Generating rolling 12 month totals for multiple industry codes

Here's a fully worked example, including both using temporary arrays AND proc expand for a second check.

 

proc sort data=sashelp.stocks out=stocks;
by stock date;
run;

data stocks_avg12;
set stocks;
by stock;

array values(12) _temporary_;

if first.stock then do;
call missing(of values(*));
count=0;
end;

count+1;

values(mod(count, 12)+1) = open;

if count>=12 then average=mean(of values(*));

format average dollar12.2;

run;

proc expand data=stocks_avg12 out=stocks_avg12_method2;
by stock;
id  date;
convert open=average2 / transformout=(movave 12 trimleft 11);
run;
Super User
Posts: 6,955

Re: Generating rolling 12 month totals for multiple industry codes


Jack1 wrote:

Thanks.  Makes sense.  In your statement:

 

Instead of using _n_, create your own counter (reatain!) that is reset whenever the industry code changes (first.code); you might also clear your array at that point.

 

How would I modify my code above to get the desired outcome?  I am fairly new to SAS coding.  Thanks again....


This would look like

%let roll_num = 12;
data new;
set newview;
by code;
retain counter;
array summed[&roll_num] _temporary_;
if first.code
then do;
  counter = 1;
  do e = 1 to &roll_num;
    summed[e] = 0;
  end;
end;
else counter + 1;
if E >= &roll_num
then E = 1;
else E + 1;
summed[E] = Bankruptcy_CA;
if counter >= &roll_num
then do;
  CA_CT = sum(of summed[*]); * CT means cumulative total;
end;
format CA_CT comma8.;
run;

Note that I slightly changed the condition where e is reset to 1, as to accommodate the result of using e as loop counter.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 52

Re: Generating rolling 12 month totals for multiple industry codes

Works great Kurt!  Thanks.....Jack

Contributor
Posts: 31

Re: Generating rolling 12 month totals for multiple industry codes

Contributor
Posts: 23

Re: Generating rolling 12 month totals for multiple industry codes

Dear sirs,

 

I have a data set from where i need to generate rolling 12 months sum on all the numeric variables. I'm not able to provide the sample data for you but the variable names are below.

 

Variable NamerepyrrepqtrrepmthdtDSNhbr_pihbr_serhtm_pihtm_serflg_NilSettledPayflg_Windscreenap_indtypeinc_amtt_bndprefix2product_descSource_Systemcnt_Outstand_Sumpay_amtm_Sumpay_amtt_Sumrt_est_amt_Suminc_outstand_Sumrec_settle_Suminc_settle_Sumlife_cyc_Sumcnt_notified_Sumcnt_Settle_Sumcnt_Reopen_Sumrec_amtm_Suminc_amtt_Sumnotificationtime_Sumnillsettled_Sum
TypeDateDateDateCharacterCharacterCharacterCharacterCharacterCharacterCharacterCharacterCharacterCharacterCharacterCharacterCharacterNumericNumericNumericNumericNumericNumericNumericNumericNumericNumericNumericNumericNumericNumericNumeric

 

and my program code is below using proc sql.

 

proc sql nowarn;

create table summary_12MTD as

select distinct t1.repyr label="Reporting Year",t1.repqtr label="Reporting Quarter",t1.repmthdt label="Reporting Month",t1.DSN label="Dataset Name",

t1.hbr_pi label="PI Handling Branch",t1.hbr_ser label="Service Handling Branch",t1.htm_pi label="PI Handling Team",

t1.htm_ser label="Service Handling Team",t1.flg_NilSettledPay label="Nill Settled Flag",

t1.flg_Windscreen label="Windscreen Flag",t1.ap_ind label="PI Indicator",t1.type label="Claim Type",

t1.inc_amtt_bnd label="Incurred Amount Band",t1.prefix2 label="Policy Prefix",

t1.product_desc label="Product Group",t1.Source_System,t1.cnt_Outstand_sum label="Monthly Outstanding Count",

t1.pay_amtm_sum label="Monthly Paid Movement",t1.pay_amtt_sum label="Monthly Total Paid",t1.rt_est_amt_sum label="Estimate Amount",

t1.inc_outstand_sum label="Outstanding Incurred Amount",t1.rec_settle_sum label="Settled Recovered Amount(Monthly)",

t1.inc_settle_sum label="Settled Incurred Amount(Monthly)",t1.life_cyc_sum label="Life cycle(Monthly)",

t1.cnt_notified_sum label="Notified Count(Monthly)",t1.cnt_Settle_sum label="Settled Count(Monthly)",

t1.cnt_Reopen_sum label="Reopen Count(Monthly)",t1.rec_amtm_sum label="Recovered Movement(Monthly)",

t1.inc_amtt_sum label="Total Incurred Amount(Monthly)",t1.notificationtime_sum label="Notification days(Monthly)",

t1.nillsettled_sum label="Nill Settled Count(Monthly)",

sum(t2.life_cyc_sum) as life_cyc_12MTD label="12MTD Lifecycle",

sum(t2.pay_amtt_sum) as pay_amtt_12MTD label="12MTD Total Paid",

sum(t2.rec_settle_sum) as rec_settle_12MTD label="12MTD Settled Recovered Amount",

sum(t2.inc_settle_sum) as inc_settle_121MTD label="12MTD Settled Incurred Amount",

sum(t2.cnt_notified_sum) as cnt_notified_12MTD label="12MTD Notified Count",

sum(t2.cnt_Settle_sum) as cnt_Settle_12MTD label="12MTD Settled Count",

sum(t2.cnt_Reopen_sum) as cnt_Reopen_12MTD label="12MTD Reopened Count",

sum(t2.inc_amtt_sum) as inc_amtt_12MTD label="12MTD Incurred Total",

sum(t2.notificationtime_sum) as notificationtime_12MTD label="12MTD Notification Time",

sum(t2.nillsettled_sum) as nillsettled_12MTD label="12MTD Nill Settled Count"

from work.summary t1 left join work.summary t2

on t1.DSN =t2.DSN

and t1.hbr_pi =t2.hbr_pi

and t1.hbr_ser =t2.hbr_ser

and t1.htm_pi =t2.htm_pi

and t1.htm_ser =t2.htm_ser

and t1.flg_NilSettledPay =t2.flg_NilSettledPay

and t1.flg_Windscreen =t2.flg_Windscreen

and t1.ap_ind =t2.ap_ind

and t1.type =t2.type

and t1.inc_amtt_bnd =t2.inc_amtt_bnd

and t1.prefix2 =t2.prefix2

and t1.product_desc =t2.product_desc

and t1.Source_System =t2.Source_System

and intck("month",t1.repmthdt,t2.repmthdt) between 0 and -11

group by t1.repyr,t1.repqtr,t1.repmthdt,t1.DSN,t1.hbr_pi,t1.hbr_ser,t1.htm_pi,t1.htm_ser,t1.flg_NilSettledPay,

t1.flg_Windscreen,t1.ap_ind,t1.type,t1.inc_amtt_bnd,t1.prefix2,t1.product_desc,t1.Source_System,

t1.nillsettled_Sum,t1.notificationtime_Sum,t1.inc_amtt_Sum,t1.rec_amtm_Sum,t1.cnt_Reopen_Sum,

t1.cnt_Settle_Sum,t1.cnt_notified_Sum,t1.life_cyc_Sum,t1.inc_settle_Sum,t1.rec_settle_Sum,

t1.inc_outstand_Sum,t1.rt_est_amt_Sum,t1.pay_amtt_Sum,t1.pay_amtm_Sum,t1.cnt_Outstand_Sum

;

quit;

 

 

I'm not able to get the correct 12MTD values...

 

ANY HELP IS MUCH APPRECIATED.

Super User
Posts: 17,868

Re: Generating rolling 12 month totals for multiple industry codes

@Reddi Please post your question as it's own new question and note that rolling statistics are more easily calculated in a data step. Especially if you want to do all numeric variables.

Ask a Question
Discussion stats
  • 10 replies
  • 663 views
  • 4 likes
  • 5 in conversation