BookmarkSubscribeRSS Feed
alatriste26
Calcite | Level 5

Hi! I have the following code for January that gives me a final table for this month. I need to run this code for all 12 months. Every time I start with a new month I retrieve the information from two datasets; sample_returns and sample_stocks. Since it takes some time so run it all I was wondering if there was a simple loop that  I could use to make the process faster. 

 

data sample_returns_jan;
set sample_returns;
where month(date)=1;
run;

data sample_stocks_jan;
set sample_stocks;
where month(date)=1;
run;

proc sql;
create table SampleJanuary as 
select i.*, cq.date, cq.ret
from sample_stocks_jan i left join sample_returns_jan cq
on i.date = cq.date and i.bucket=cq.bucket and i.permno=cq.permno;

proc sql;
create table sampleJanuary1 as
select i.*, sum(EW*ret) as EW_ret, sum((sqrt(12)*(volatility))*100) as vol_annual,
sum(((1+ret)**12)-1) as ret_annual
from SampleJanuary i
group by month_bucket;

data sampleJanuary1a;
set samplejanuary1;
SP_annual= sum(ret_annual/vol_annual);
by month_bucket;
run;

/*Keeping only the relevant information*/
data final_January;
set sampleJanuary1a;
keep bucket date EW_ret vol_annual SP_annual;
run;

4 REPLIES 4
PaigeMiller
Diamond | Level 26

Unless you have a HUUUUGE data set that your hardware can't handle easily, looping is not necessary and will not speed things up.


It seems to me that everything you are doing can be done on the entire data set, without splitting them into data sets for each month. Yes, it may take a long time but it will still be faster that way.

--
Paige Miller
pmbrown
Quartz | Level 8

put a macro around it and make the month a macro variable and wrap the code in a do loop:

%do month=1 %to 12; 
:
:
%end;

 just a thought

ChrisNZ
Tourmaline | Level 20

Further to @PaigeMiller 's comment, I contend that the 2 SQL steps can be grouped, and that the last 2 steps can be removed altogether as they bring no value, and their logic added to the single SQL step. 

 

In other words, the 12 times 6 steps can be reduced to one SQL query. The query will be much faster if the two sources tables are sorted and/or indexed.

PaigeMiller
Diamond | Level 26

Adding to my original reply and to @ChrisNZ 's comments

 

Your original code, with WHERE statements in data steps will require SAS to read through the original data set 12 times (if you do this in a loop). If you do it as one big (non-loop) action, SAS has to read through the original data set only one time.

 

AND

 

The non-loop solution is a lot less programming than the loop solution.

--
Paige Miller

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1153 views
  • 2 likes
  • 4 in conversation