Hi, I am progressing through my code and have an annoyingly long calculation that is very repetitive, however, one element of it uses a calculation as a one off. I want to be able to simplify this so I am not writing it multiple times. I have included my test attempt code at the bottom to show what I was trying to do (but failing miserably)
data calc;
set withdrawal_data;
do wdwl_FC_1=((Bal_Oct20+(Bal_Oct20*Money_Out_1))+(Adjustments_1*1000000)); end;
do wdwl_FC_2 = ((wdwl_FC_1+(wdwl_FC_1*Money_Out_1))+(Adjustments_1*1000000));
wdwl_FC_3 = ((wdwl_FC_2+(wdwl_FC_2*Money_Out_2))+(Adjustments_2*1000000));
wdwl_FC_4 = ((wdwl_FC_3+(wdwl_FC_3*Money_Out_3))+(Adjustments_3*1000000));
wdwl_FC_5 = ((wdwl_FC_4+(wdwl_FC_4*Money_Out_4))+(Adjustments_4*1000000));
wdwl_FC_6 = ((wdwl_FC_5+(wdwl_FC_5*Money_Out_5))+(Adjustments_5*1000000));
wdwl_FC_7 = ((wdwl_FC_6+(wdwl_FC_6*Money_Out_6))+(Adjustments_6*1000000));
wdwl_FC_8 = ((wdwl_FC_7+(wdwl_FC_7*Money_Out_7))+(Adjustments_7*1000000));
wdwl_FC_9 = ((wdwl_FC_8+(wdwl_FC_8*Money_Out_8))+(Adjustments_8*1000000));
wdwl_FC_10 = ((wdwl_FC_9+(wdwl_FC_9*Money_Out_9))+(Adjustments_9*1000000));
wdwl_FC_11 = ((wdwl_FC_10+(wdwl_FC_10*Money_Out_10))+(Adjustments_10*1000000));
wdwl_FC_12 = ((wdwl_FC_11+(wdwl_FC_11*Money_Out_11))+(Adjustments_11*1000000));
end;
run;
*Note this goes on for 60 iterations so I have only included 12 here and in my test
I have done the first (different) calc separately in another dataset in my test to try and get round the issue with the following code but
get errors in the arrays bit. Please can you help resolve this for me as driving me mad again. If there is a better way to do this instead of the arrays please let me know.
data calc;
set wdwl_calc;
wdwl_FC_202010=((Bal_Oct20+(Bal_Oct20*Money_Out_202010))+(Adjustments_202010*1000000));
run;
data fcast;
set calc;
run;
data tester;
set fcast;
array money (202010:202109) Money_Out_202010  Money_Out_202109;
array wdwl (202010:202109) Adjustments_202010  Adjustments_202109;
array fcwdrawal(12);
array wdrwal (202010:202109) Withdrawal_FC_202010  Withdrawal_FC_202109;
do i = 1 to 12;
fcwdrawal(i) = ((Withdrawal_FC_202010+(fcwdrawal(i)*money(i)))+(wdwl(i)*1000000));
end;
run;
Thank you
David
From your code, I take it that you suffer from a bad dataset layout (data (dates) in structure (column names)).
You should first transpose that to a long layout, and then you do a running calculation with retained variables.
For help with the transposition, supply a short example of your dataset, ideally as a data step with datalines.
So as the data has each element with a date and month in it ie 202010 for October 20 what would I need to change the dimensions to in order to get it to pull these through.
@djk200399 wrote:
So as the data has each element with a date and month in it ie 202010 for October 20 what would I need to change the dimensions to in order to get it to pull these through.
Show us a portion of the data, following these instructions: https://blogs.sas.com/content/sastraining/2016/03/11/jedisastricksdatatodatastepmacro/
Fix the index into the array.
data tester;
set fcast;
array money Money_Out_202010  Money_Out_202012 Money_Out_202101Money_Out_202109;
array wdwl Adjustments_202010  Adjustments_202012 Adjustments_202101  Adjustments_202109;
array fcwdrawal [12];
array wdrwal Withdrawal_FC_202010  Withdrawal_FC_202012 Withdrawal_FC_202101  Withdrawal_FC_202109;
do i = 1 to 12;
fcwdrawal[i] = ((Withdrawal_FC_202010+(fcwdrawal[i]*money[i]))+(wdwl[i]*1000000));
end;
run;
Hi Tom,
So are you suggesting if I have an array with the multiple years noted next to it, as you have done here it should fix the issue?
I will give this a go on the code later on. Thank you
Hi Tom,
I tried your suggestion but get an error saying array subscript out of range at line 30 column 20. I will try and sort the sharing of a dataset on Monday that way it might be easier to see what I am doing wrong !
I have transposed my datasets (which I know do all the time when reading them in from excel) to improve things but need to work out how to do this via an array.
I have 2 data sets that look like this
Dataset A  
no  brand  product  period  acc 
1  A  Regular  202010  300 
2  A  Fixed  202010  350 
3  B  Regular  202010  250 
4  B  Fixed  202010  200 
5  C  Regular  202010  150 
1  A  Regular  202011  250 
2  A  Fixed  202011  250 
3  B  Regular  202011  400 
4  B  Fixed  202011  200 
5  C  Regular  202011  150 
1  A  Regular  202012  150 
2  A  Fixed  202012  160 
3  B  Regular  202012  300 
4  B  Fixed  202012  200 
5  C  Regular  202012  400 
1  A  Regular  202101  450 
2  A  Fixed  202101  325 
3  B  Regular  202101  350 
4  B  Fixed  202101  275 
5  C  Regular  202101  190

Dataset B  
no  brand  product  period  int 
1  A  Regular  202011  10 
2  A  Fixed  202011  12 
3  B  Regular  202011  12 
4  B  Fixed  202011  10 
5  C  Regular  202011  15 
1  A  Regular  202012  20 
2  A  Fixed  202012  13 
3  B  Regular  202012  12 
4  B  Fixed  202012  5 
5  C  Regular  202012  7 
1  A  Regular  202101  2 
2  A  Fixed  202101  11 
3  B  Regular  202101  3 
4  B  Fixed  202101  3 
5  C  Regular  202101  14 
1  A  Regular  202102  12 
2  A  Fixed  202102  12 
3  B  Regular  202102  12 
4  B  Fixed  202102  10 
5  C  Regular  202102  10 
I want to use the 2 datasets to undertake an ongoing calculation
The first dataset is one that was created last month with forecast numbers going forward from an actual position in 202010
The second is the new calculation that needs to take place on this dataset but using the prior month result
So in the case of no1 the first calc should be acc for 202010 from dataset A plus int for 202011 from dataset B. The second calc should use the outcome of the first calc as it's starting point and then add int for 202012 from dataset B and so on until the end.
How do I use the result of each calc for the next calculation simply and efficiently ?
What I should see here results wise using no1 as an example is
calc  
202011  310 
202012  330 
202101  332 
202102  344 
I hope this makes sense and would an array work to do this or am I over complicating things??
Thanks in advance of your help again
Repeating my earlier request to you in this thread
Show us a portion of the data, following these instructions: https://blogs.sas.com/content/sastraining/2016/03/11/jedisastricksdatatodatastepmacro/
I have tried running the said macro but it does not work in my SAS given some of the internal restrictions. Instead I have attached the two datasets (simplified versions) as csv's for a quick upload.
So for these I need a calculation of acc from accrual dataset at 202010 + the value of calc from the all1 dataset for 202011 merged by the no1 field and then the next calculation will be the last calculation we did + value of calc from the all dataset for 202012 merged by the no1 field and so on.
Appreciate your help on this.
There is no variable int in any of your CSV's.
PLEASE POST USABLE DATA.
(which fits your descriptions, or adapt the descriptions to the data)
proc sort
data=ds_a (where=(period='01oct2020'd))
out=ds_a_
;
by brand product;
run;
proc sort
data=ds_b (where=(period ge '01nov2020'd))
out=ds_b_
;
by brand product_period;
run;
data want;
merge
ds_a_
ds_b_
;
by brand product;
if first.product
then calc = acc;
calc + int;
keep brand product calc;
run;
Untested, for lack of usable data.
Apologies for not including the correct data, I managed to do a similar solution to yours Kurt so apologies for the failure on my part
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 69 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.