
A data step or macro which is efficient?Give reasons

Frequent Contributor
Posts: 89


Efficiency:

If the dataset has one million observations, which of the following programs is more efficient in terms of CPU time?

Partial dataset:

Dataset:

Var3  Seq1  var1  var2
US    100   10    1
EU    200   20    2
IND   300   30    3

Program No.1

data US IND EU;
  set dataset;
  if var3 = 'US' then output US;
  if var3 = 'EU' then output EU;
  if var3 = 'IND' then output IND;
run;

Program no. 2

%macro report(country);
  data &country;
    set dataset;
    /* double quotes so that &country resolves inside the literal */
    if var3 = "&country" then output &country;
  run;
%mend report;

%report(US)
%report(EU)
%report(IND)

Valued Guide
Posts: 2,177

Re: A data step or macro which is efficient?Give reasons

Posted in reply to venkatnaveen

I expect program 1 would be faster because it reads the input only once.

Both programs write the same three datasets.
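If you want to check this on your own data rather than take my word for it, a minimal sketch is to switch on FULLSTIMER and compare the CPU time notes in the log. This assumes the input is WORK.DATASET as in the question; the same option applies unchanged when you run Program 2.

```sas
/* Assumption: the input is WORK.DATASET, as posted in the question. */
options fullstimer;    /* write detailed real time and CPU time notes per step */

/* Program 1: a single pass over the input */
data US IND EU;
  set dataset;
  if var3 = 'US' then output US;
  if var3 = 'EU' then output EU;
  if var3 = 'IND' then output IND;
run;

/* Now run Program 2 and add up the CPU time of its three data steps. */

options nofullstimer;  /* back to the shorter log notes */
```

Program 2 reports one set of timings per macro call, so its total is the sum over all three steps.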

Super User
Posts: 3,254

Re: A data step or macro which is efficient?Give reasons

Posted in reply to venkatnaveen

The following logic is also quicker, because the ELSE skips the remaining comparisons once a match is found. If you put the IF statements in order of most frequent value first, it is faster again (assuming US has more rows than EU and EU has more rows than IND).

if var3 = 'US' then output US;
else if var3 = 'EU' then output EU;
else if var3 = 'IND' then output IND;
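To find out which value really is the most frequent, a quick count is enough. A small sketch, again assuming the input is WORK.DATASET; ORDER=FREQ lists the values by descending count, which is the order to use for the IF / ELSE IF chain.

```sas
/* Assumption: the input is WORK.DATASET with the character variable var3. */
proc freq data=dataset order=freq;
  tables var3 / nocum;   /* counts and percentages, most frequent value first */
run;
```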


Super User
Posts: 5,430

Re: A data step or macro which is efficient?Give reasons

Posted in reply to venkatnaveen

Program no. 2 shouldn't be allowed anywhere! A single IF with an OUTPUT... Always use WHERE in such situations.

Agree with SASkiwi, based on the limited information in the post.

Is this a real use case, or just an educational question?

If the data set (or DBMS table) has many more values for var3 than those mentioned, a macro approach with WHERE could be more efficient, provided that var3 is indexed, or the table is partitioned (e.g. Oracle) or clustered (SPDS) by var3.
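Roughly what I mean by the WHERE variant, as a sketch only: it assumes the input is WORK.DATASET, and the index step is optional and only worth it when each country is a small share of the rows.

```sas
/* Optional, an assumption for this sketch: a simple index on var3 */
proc datasets library=work nolist;
  modify dataset;
  index create var3;
quit;

%macro report(country);
  data &country;
    /* WHERE filters rows before they are loaded into the data step */
    set dataset(where=(var3 = "&country"));
  run;
%mend report;

%report(US)
%report(EU)
%report(IND)
```

Unlike the subsetting IF, the WHERE condition is applied as the rows are read, so SAS can use the index (or the DBMS can use its partitioning) instead of scanning every row.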
