DATA Step, Macro, Functions and more

Making If statement more efficient.

Reply
Contributor
Posts: 53

Making If statement more efficient.

Hi, is there a way to make if statement more efficient? I ve got about 80 mln obs with over 80 if statments in loop, processing this dataset takes couple of hours? Belowe you will find scrap of the code:

 

%Macro test(x_date)
data want;
set have;
		%do i=3 %to 12 %by 3; 
			if datepart(date)>=intnx('month', "&x_date"d,%eval(-&i+1), 'B') and datepart(data_danych)<=intnx('month', "&x_date"d,0,'E') and kod_3 = '1001' then do;
		  var1_&i.m=round(amount*exch_rete,0.01);
			end;

			else if datepart(date)>=intnx('month', "&x_date"d,%eval(-&i+1), 'B') and datepart(data_danych)<=intnx('month', "&x_date"d,0,'E') and kod_3 = '1201' then do;
			var2_w_UP_&i.m=round(amount*exch_rate,0.01);
			end;

			else if datepart(date)>=intnx('month', "&x_date"d,%eval(-&i+1), 'B') and datepart(data_danych)<=intnx('month', "&x_date"d,0,'E') and kod_3 in ('1041','1241') then do;
			 var3_&i.m=round(amount*exch_rate,0.01);
			end;

........
%end;

run;
%mend;
test(31AUG2017);

 

 

 

Thank you.

PROC Star
Posts: 755

Re: Making If statement more efficient.

Why do you want to do this in a macro? Why not do it with just the data step?

 

You say you have 80 different if statements, but are the two first conditions in your if statements always the same like they are below, so only the third condition differ?

 

 

Contributor
Posts: 53

Re: Making If statement more efficient.

[ Edited ]

All conditions are differ but variables are always the same, actualy I do this in macro, due to &i macro variable.

PROC Star
Posts: 755

Re: Making If statement more efficient.

In my opinion, you should not use macro programming if it is not necessary Smiley Happy

 

If you have the same pattern in all your if statements, where the two first conditions are the same, there is no need to test these two conditions in every if statement. Instead do something like this

 

%do i=3 %to 12 %by 3; 
	if datepart(date)>=intnx('month', "&x_date"d,%eval(-&i+1), 'B') and datepart(data_danych)<=intnx('month', "&x_date"d,0,'E') then do;
		if kod_3 = '1001' then var1_&i.m=round(amount*exch_rete,0.01);
		else if kod_3 = '1201' then var2_w_UP_&i.m=round(amount*exch_rate,0.01);
		else if kod_3 in ('1041','1241') then var3_&i.m=round(amount*exch_rate,0.01);
	end;
%end;
Super User
Posts: 7,809

Re: Making If statement more efficient.

And this:

if kod_3 = '1001' then var1_&i.m=round(amount*exch_rete,0.01);
else if kod_3 = '1201' then var2_w_UP_&i.m=round(amount*exch_rate,0.01);
else if kod_3 in ('1041','1241') then var3_&i.m=round(amount*exch_rate,0.01);

looks much better when coded as

select (kod_3);
  when ('1001') var1_&i.m=round(amount*exch_rete,0.01);
  when ('1201') var2_w_UP_&i.m=round(amount*exch_rate,0.01);
  when ('1041','1241') var3_&i.m=round(amount*exch_rate,0.01);
  otherwise;
end;

and might even perform better.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Frequent Contributor
Posts: 100

Re: Making If statement more efficient.

[ Edited ]

Try this.....

data KOD_3_VALUES;
  infile datalines;
  input cond $30.;
datalines;
='1001'
='1201'
in ('1041','1241')
;
run;


data _null_;
  set KOD_3_VALUES end=obs;
  call symput(compress("cond"||_n_), cond);
  if obs then call symput('totobs',_n_);
run;

%macro TestMyConditions;
  %do i=1 %to &totobs.;
    %if %eval(&i=1) %then %put if kod_3 %bquote(&&cond&i);
       %else  %if %eval(&i>1) %then %put if kod_3 %bquote(&&cond&i);
  %end;
%mend;

%TestMyConditions;

 

Hope this helps.

Super User
Super User
Posts: 7,970

Re: Making If statement more efficient.

It would help a lot if you actually showed some test data (in the form of a datastep using the code window {i}).  It looks to me like using arrays, i.e. a reference to a groups of variables, would be far simpler than doing lots of if statements.  The code you have shown doesn't really show any differences??  One other option, which will likely make your life a lot easier is to normalise the data, i.e. have the various variables going down the page rather than across, you would need a lot of loops and arrays that way.  Again, post test data so we can work on it, and show what you want out.  Otherwise we are just guessing.

Super User
Posts: 5,513

Re: Making If statement more efficient.

@Matt3,

 

IF / THEN runs pretty quickly.  The CPU hog here is the number of functions you are calling.  You can reduce them considerably by making a few changes;

 

data want;

set have;

dp = datepart(date);

 

Now you never need another DATEPART function in the rest of your code ... just refer to the variable DP.

 

%do i=3 %to 12 %by 3;

   range_begins = intnx('month', "&x_date"d, %eval(-&i+1), 'B');

   range_ends = intnx('month', "&x_date"d, 0, 'E');

   ...

%end;

 

Again, you don't need any additional INTNX functions.  Just refer to the variables RANGE_BEGINS and RANGE_ENDS.

 

A much smaller savings could then be achieved by nesting the IF / THEN conditions.  For example, you could try:

 

if range_begins <= dp <= range_ends then do;

   if kod_3 = '1001' then ...

   else if kod_3 = '1201' then ...

   else etc. ...

end; 

Super User
Super User
Posts: 7,060

Re: Making If statement more efficient.

Hard to tell what the logic is without either specifications or data, but couldn't you turn this loop around.

 

%do i=3 %to 12 %by 3; 
if datepart(date)>=intnx('month', "&x_date"d,%eval(-&i+1), 'B')
 and datepart(data_danych)<=intnx('month', "&x_date"d,0,'E') then do;
  if  kod_3 = '1001' then do;
    var1_&i.m=round(amount*exch_rete,0.01);
  end;
...
end;
%end;

 It looks like you are dividing things into quarters and updating a series of quarter based varaibles.

Couldn't you just calculate the quarter from the date and then generate the index?

 

 

Super User
Super User
Posts: 7,060

Re: Making If statement more efficient.

[ Edited ]

Don't call functions to recalculate on every observations values that are constants.

Don't call functions you don't need.

So you would change something like this:

%do i=3 %to 12 %by 3; 
if datepart(date)>=intnx('month', "&x_date"d,%eval(-&i+1), 'B')
 and datepart(data_danych)<=intnx('month', "&x_date"d,0,'E') then do;

to:

%do i=3 %to 12 %by 3; 
 %let lower=%sysfunc(intnx(dtmonth,"&x_date:00:00"dt,-&i+1,b));
 %let upper=%sysfunc(intnx(dtmonth,"&x_date:00:00"dt,0,e));
if date >= &lower and date_danych <= &upper then do;

So calculate the constants that are based on macro variables since they are know before the data step is even compiled.

Also calculate the constants as datetime values instead of date values so that you no longer need to call the DATEPART() function multiple times on each observation.

 

Ask a Question
Discussion stats
  • 9 replies
  • 142 views
  • 2 likes
  • 7 in conversation