A data step or a macro: which is more efficient? Give reasons

venkatnaveen · Posted 08-24-2014 01:43 PM

Efficiency:

If the dataset has one million observations, which of the following programs is more efficient in terms of CPU time?

Partial dataset:

Var3  Seq1  var1  var2
US    100   10    1
EU    200   20    2
IND   300   30    3

Program No.1

Data US IND EU;

Set dataset;

If var3='US' then output US;

If var3='EU' then output EU;

If var3='IND' then output IND;

Run;

Program no. 2

%macro report(country);

Data &country;

Set dataset;

If var3="&country" then output &country;

Run;

%mend report;

%report(US)

%report(EU)

%report(IND)

Peter_C · Posted 08-24-2014 01:45 PM

I expect program 1 would be faster because it reads the input only once.

Both programs write the same three datasets.
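To check this on your own system rather than take it on trust, one option is to build a test table and compare the step timings in the log. The sketch below is only an illustration: the one-million-row table, the even split between the three countries, and the values of Seq1, var1 and var2 are all made up for the test.

```sas
/* Write detailed real time and CPU time for every step to the log */
options fullstimer;

/* Hypothetical test data: one million rows, countries evenly split */
data dataset;
    length var3 $3;
    do seq1 = 1 to 1000000;
        select (mod(seq1, 3));
            when (0) var3 = 'US';
            when (1) var3 = 'EU';
            otherwise var3 = 'IND';
        end;
        var1 = seq1 * 10;   /* arbitrary filler values */
        var2 = seq1;
        output;
    end;
run;
```

Run Program No.1 and Program no. 2 against this table and compare the "cpu time" lines in the log. Program 2 produces three steps, so add its three timings together.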

SASKiwi · Posted 08-24-2014 03:44 PM

The following logic is also quicker, because ELSE skips the remaining tests once one condition is true. If you also put the IF statements in order of most frequent value first, it is faster again (US has more rows than EU and EU has more rows than IND).

If var3='US' then output US;

else If var3='EU' then output EU;

else If var3='IND' then output IND;
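Put together as a complete step, the suggestion might look like the sketch below. It reuses the dataset and variable names from the original question and assumes the frequency order stated above (US, then EU, then IND).

```sas
data US EU IND;
    set dataset;
    /* Once a condition is true, the ELSE branches are not evaluated. */
    /* The most frequent value is tested first to minimise comparisons. */
    if var3 = 'US' then output US;
    else if var3 = 'EU' then output EU;
    else if var3 = 'IND' then output IND;
run;
```

The input is still read only once, as in Program No.1. The saving comes from doing at most one comparison for US rows instead of three.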

LinusH · Posted 08-25-2014 10:11 AM

Program no. 2 shouldn't be allowed, anywhere! A single IF with an OUTPUT... Always use WHERE in such situations, so that rows are filtered before the data step processes them.

Agree with SASKiwi, based on the limited information in the post.

Is this a real use case, or just an educational question?

If the data set (or DBMS table) has many more values for var3 than those mentioned, a macro approach with WHERE could be more efficient, provided that var3 is indexed, or the table is partitioned (e.g. Oracle) or clustered (SPDS) by var3.
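A minimal sketch of that WHERE-based variant, assuming the table lives in the WORK library under the name used in the question. The simple index on var3 is optional and only pays off when each value selects a small share of the rows.

```sas
/* Optional: create a simple index on var3 (assumes WORK.DATASET exists) */
proc datasets library=work nolist;
    modify dataset;
    index create var3;
quit;

%macro report(country);
    data &country;
        /* WHERE= filters rows on input; double quotes let &country resolve */
        set dataset(where=(var3 = "&country"));
    run;
%mend report;

%report(US)
%report(EU)
%report(IND)
```

With only three values spread over a million rows, as in the original question, the single-pass data step is still likely to win. This form becomes interesting when only a few of many var3 values are needed.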

Data never sleeps
