Hi! I have the following code for January that gives me a final table for this month. I need to run this code for all 12 months. Every time I start with a new month I retrieve the information from two datasets; sample_returns and sample_stocks. Since it takes some time so run it all I was wondering if there was a simple loop that I could use to make the process faster.
data sample_returns_jan; set sample_returns; where month(date)=1; run; data sample_stocks_jan; set sample_stocks; where month(date)=1; run; proc sql; create table SampleJanuary as select i.*, cq.date, cq.ret from sample_stocks_jan i left join sample_returns_jan cq on i.date = cq.date and i.bucket=cq.bucket and i.permno=cq.permno; proc sql; create table sampleJanuary1 as select i.*, sum(EW*ret) as EW_ret, sum((sqrt(12)*(volatility))*100) as vol_annual, sum(((1+ret)**12)-1) as ret_annual from SampleJanuary i group by month_bucket; data sampleJanuary1a; set samplejanuary1; SP_annual= sum(ret_annual/vol_annual); by month_bucket; run; /*Keeping only the relevant information*/ data final_January; set sampleJanuary1a; keep bucket date EW_ret vol_annual SP_annual; run;
Unless you have a HUUUUGE data set that your hardware can't handle easily, looping is not necessary and will not speed things up.
It seems to me that everything you are doing can be done on the entire data set, without splitting them into data sets for each month. Yes, it may take a long time but it will still be faster that way.
put a macro around it and make the month a macro variable and wrap the code in a do loop:
%do month=1 %to 12; : : %end;
just a thought
Further to @PaigeMiller 's comment, I contend that the 2 SQL steps can be grouped, and that the last 2 steps can be removed altogether as they bring no value, and their logic added to the single SQL step.
In other words, the 12 times 6 steps can be reduced to one SQL query. The query will be much faster if the two sources tables are sorted and/or indexed.
Adding to my original reply and to @ChrisNZ 's comments
Your original code, with WHERE statements in data steps will require SAS to read through the original data set 12 times (if you do this in a loop). If you do it as one big (non-loop) action, SAS has to read through the original data set only one time.
AND
The non-loop solution is a lot less programming than the loop solution.
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.