Hanyu
Fluorite | Level 6

1. I have SAS/IML 14.1 and have simulated 10,000 price paths, stored as COL1-COL10000 in a SAS data set. I want to repeat a calculation for each path, but I run out of memory. Windows Task Manager shows IML using more and more memory as I loop over the paths.

 

2. I have adjusted my SAS profile to let SAS/IML use 12 GB of memory. Using the IML statement SHOW ALLNAMES SPACE;, I found many empty symbols named COL1-COL10000, each of which takes up 8 bytes of memory.

 

3. I suspect this is why I don't have enough memory for my calculations, even though I reuse the other nonempty matrices. I think that when I read specific variables into a SAS/IML matrix, all the variable names are loaded as well and take up a lot of memory. Could anyone help me?

PeterClemmensen
Tourmaline | Level 20

Are you able to construct the 10,000 price paths without errors?

 

What kind of repeated calculation do you want to do? Perhaps you would be better off writing your data to a long SAS data set and using BY-group processing in an appropriate procedure.
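
As a rough sketch of that idea, assuming a hypothetical wide data set WORK.PATHS_WIDE with one row per DATE and the columns COL1-COL10000 (the data set names and the summary statistic are placeholders, not taken from your program), the data could be transposed to one row per path and date and then processed by path:

```sas
/* Assumption: WORK.PATHS_WIDE is a hypothetical wide data set with one row
   per DATE and one column per simulated path (COL1-COL10000). */
proc sort data=work.paths_wide;
   by date;
run;

/* Wide to long: one row per (date, path). The transposed values land in
   a variable named COL1, which is renamed to PRICE on output. */
proc transpose data=work.paths_wide
               out=work.paths_long(rename=(col1=price))
               name=path;
   by date;
   var col1-col10000;
run;

proc sort data=work.paths_long;
   by path date;
run;

/* Placeholder BY-group calculation: mean price for each path */
proc means data=work.paths_long noprint;
   by path;
   var price;
   output out=work.path_summary mean=mean_price;
run;
```

PROC MEANS only stands in for whatever per-path calculation you need; any procedure that supports a BY statement could take its place.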

Hanyu
Fluorite | Level 6

Hi draycut, 

 

Even when I compute only 10 paths, I can see memory gradually filling up. I do empty the intermediate matrices for reuse and store only the computation result. I want to calculate the net present value (NPV) of each path, which involves some tax calculations that are easier in SAS/IML.

Rick_SAS
SAS Super FREQ

You need to show the code so we can see what computations you are doing and how you are doing them. I use 10,000-column matrices in simulations regularly, so I don't think the number of variables is the problem. How many observations do you have, and what computations are you doing?

 

To get started, you might want to do an internet search such as 

simulate OR simulation "proc iml" site:blogs.sas.com/content/iml

You can add other terms if there are particular tips you are looking for.

 

Hanyu
Fluorite | Level 6

proc iml;


use egsim.pjm_sim_sample;
read all var {date} into full_date;
read all var{col1} into full_pjm_sim1;
close;
full_year=year(full_date);
full_mn=month(full_date);
full_elec=full_pjm_sim1||full_mn||full_year;
pjm_unique_mns=uniqueby(full_elec,2:3,1:nrow(full_elec));

full_year={};
full_mn={};
full_elec={};

 


use egsim.macr;
read all var{depr} into depr;
close;

use egsim.issue_flat_mn;
read all var{prod} into prod_mn;
close;

path_start=1;

path_end=50;

no_npv_paths=path_end-path_start+1;
npv_total=J(48,no_npv_paths,.);
cost=10650000;
pjm_paths=t('col1':'col5000');
srec_paths=t('srec_sim1':'srec_sim5000');

do path_index=path_start to path_end;
pjm_path=pjm_paths[path_index];
srec_path=srec_paths[path_index];
do index=1 to 48;
start=pjm_unique_mns[index];
end=pjm_unique_mns[240+index]-1;
use egsim.pjm_sim_sample;
read point (start:end) var pjm_path into pjm_sim;
read point (start:end) var {date} into date;
close;
use egsim.srec_sim_sample;
read point (index:(120+index-1)) var srec_path into srec_sim;
close;

ebt={};
cashflow={};
net_income={};
tax={};
pv={};
elec_rev={};
total_rev={};
rev={};
ebitda={};
prod_daily={};
fixed_OM={};
ebit={};
ebt_temp={};
tax_rate=(0.35+0.0897);
if year(date[1])=2019 then ITC=0.3;
else if year(date[1])=2020 then ITC=0.26;
else if year(date[1])=2021 then ITC=0.22;
else ITC=0.1;
renumer=cost#ITC;
debt=cost#0.43;

 

degrade=t(1||0.995##(1:19))#87600#0.18;
do prod_index=1 to 20;
yr=pjm_unique_mns[index+12*prod_index]-pjm_unique_mns[index+(prod_index-1)*12];
prod_daily=prod_daily//repeat(degrade[prod_index]/yr,yr,1);
end;

 

elec_day=pjm_sim#prod_daily;
srec_rev=srec_sim#prod_mn[1:120,];
elec_day=month(date)||year(date)||elec_day;

unique_rows=uniqueby(elec_day,1:2,1:nrow(elec_day));


do mn_index=1 to nrow(unique_rows);
/* Next line is for the last BY group */
if mn_index=nrow(unique_rows) then group_index=unique_rows[mn_index]:nrow(elec_day);
else group_index=unique_rows[mn_index]:unique_rows[mn_index+1]-1;
submat=elec_day[group_index,];
elec_rev=elec_rev//submat[+,3];
end;

 



/* Add the first nrow(srec_rev) months of electricity revenue to the SREC revenue */
rev1=srec_rev+elec_rev[1:nrow(srec_rev),];
rev2=elec_rev[(nrow(srec_rev)+1):nrow(elec_rev),];
rev=rev1//rev2;


/*call series(1:nrow(rev),rev);*/
do i=1 to 20;
total_rev=total_rev//sum(rev[(i-1)*12+1:i*12,]);
end;
do i=1 to 20;
fixed_OM=fixed_OM//9#sum(prod_mn[(i-1)*12+1:i*12,])/8760*1000;
end;

ebitda=total_rev-fixed_OM;
depr_val=depr#cost;
ebit1=ebitda[1:nrow(depr_val),]-depr_val;
ebit2=ebitda[nrow(depr_val)+1:nrow(ebitda),];
ebit=ebit1//ebit2;


/*print ebit;*/

ir_pay=mort(cost#0.4,.,0.06,10);

ebt=ebit[1:nrow(ir_pay),]-ir_pay;
ebt=ebt//ebit[nrow(ebt)+1:nrow(ebit),];

/*The ifn function is the same as ifelse function in R*/
tax=ifn(ebt<0,0,.);
/*There is no concern about accumulated loss for the first cashflow*/
if tax[1]=. then tax[1]=ebt[1]#tax_rate;
/*The first pass of looping handles the situation where you have 2 or more years of loss and cannot get positive cashflow by the following year */
do i=1 to nrow(ebt)-1;
if ebt[i,]<0 then do;

ebt_temp=sum(ebt[i:i+1,]);
if ebt_temp<0 then do;
tax[i+1,]=0;
j=1;
/*The UNTIL condition handles the case where the accumulated loss still cannot be offset after 20 years, or
the accumulated loss runs longer than 15 years*/

do until(ebt_temp>0 | i+j>20 |j>15);
ebt_temp=sum(ebt[i:i+j,]);
/*The following statement fills in 0 for all intermediate cashflows that cannot offset the accumulated losses*/
tax[i+j-1,]=0;
j=j+1;
end;
/*The max function is to handle the situation if you have accumulated losses after 20 years then the tax should be zero */
tax[i+j-1,]=max(ebt_temp#tax_rate,0);
end;
end;
end;


/*The second pass of looping handles the situation where you have 1 or 0 year of loss
and get positive cashflow by the following year. Or you just have positive cashflows without accumulated losses and must be taxed on them. */
do i=2 to nrow(tax);
if tax[i]=. then do;
if ebt[i]>0 & ebt[i-1]<0 then tax[i]=(ebt[i]+ebt[i-1])#tax_rate;
if ebt[i]>0 & ebt[i-1]>0 then tax[i]=ebt[i]#tax_rate;
end;
end;

net_income=ebt-tax;
/*call series(1:20, total_rev);*/
cashflow=depr_val+net_income[1:nrow(depr),];
cashflow=cashflow//net_income[nrow(depr)+1:nrow(net_income),];
/*print cashflow depr_val net_income;*/
rate=t(1/((1+0.072)##(1:20)));
pv=sum(cashflow#rate);
npv=pv+renumer/(1+0.072)-cost;
npv_pos=path_index-path_start+1;
npv_total[index,npv_pos]=npv;

end;
end;

quit;

 

Hi Rick, since the computation also involves looping within a single path, my approach is to limit the number of observations by using the READ POINT ... VAR ... INTO technique. You can see that I initialize all intermediate matrices as empty and fill them later, so each iteration should first clear the matrices from the previous iteration and start over. I don't understand why I run out of memory.

 

I can get the calculation for 10 paths but not for 100 paths. I attached a sample of 50 paths for both simulated variables. You can download all files to run the code.  

Hanyu
Fluorite | Level 6
Also, I observe that memory gradually fills up.
Rick_SAS
SAS Super FREQ

I count 12 DO statements (iterations or begin a DO block) and only 11 END statements to end the loop/block. I don't understand how this program could run without error.

Rick_SAS
SAS Super FREQ

I added an extra END; just before the QUIT statement and now the program completes without error.

 

I have a few minor suggestions:

1. Use OPTIONS NONOTES prior to the PROC IML statement to get rid of the thousands of NOTES about closing files during the inner loop.

2. Instead of assigning a symbol the empty matrix, use the FREE statement to clear the symbols you don't need. For example:

 

free ebt cashflow net_income tax pv elec_rev
   total_rev rev ebitda prod_daily fixed_OM
   ebit ebt_temp;

 

3. The keyword coloration and auto-indent facilities in EG can get confused when you use keywords as variable names. You might want to change the variable named END to something like EEND. For example, you could write

 

sstart=pjm_unique_mns[index];
eend=pjm_unique_mns[240+index]-1;
use egsim.pjm_sim_sample;
read point (sstart:eend) var pjm_path into pjm_sim;
read point (sstart:eend) var {date} into date;
close;
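
Suggestions 1 and 2 can be combined as in the following minimal sketch. The loop body is a made-up stand-in for the real computation, and restoring NOTES afterwards is an extra step I am assuming you would want:

```sas
options nonotes;                   /* suggestion 1: suppress NOTEs while the loops run */
proc iml;
   do path_index = 1 to 3;
      x = j(1000, 1, path_index);  /* hypothetical intermediate matrix */
      total = sum(x);              /* hypothetical result kept from this iteration */
      free x;                      /* suggestion 2: release x before the next iteration */
   end;
   print total;
quit;
options notes;                     /* assumption: turn NOTEs back on afterwards */
```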

 
