DATA Step, Macro, Functions and more

Alternate to max function

Reply
Super Contributor
Posts: 321

Alternate to max function

Hi experts,

Is there an alternate(and more efficient) way on getting the max value for a particular column? currently i'm using max function on proc sql.

Thanks!
Milton
N/A
Posts: 1

Re: Alternate to max function

I think the MEANS procedure is nice enough for it.
Super User
Posts: 3,115

Re: Alternate to max function

My experience has been that SQL runs faster for deriving a few basic statistics like mean, sum, max with simple grouping, even if the code is longer. There is a significant overhead just invoking the MEANS procedure so for simple statistics and single level grouping it is usually slower.

However MEANS is definitely a better choice for more complicated grouping and statistics.
Super User
Posts: 5,260

Re: Alternate to max function

What costs in this operation is to do a table scan to find out the maximum value. So I don't think that any technique is more efficient than any other.

In SPD Server (and other RDBMS?), if your max column is indexed, the query will get the answer by just analyse the index meta data. On large tables, this could boost performance with thousands of percent... It's a pity that this hasn't been implemented (yet) for Base SAS engine/SQL, since the information in the index is already available.

/Linus
Data never sleeps
Super User
Posts: 9,691

Re: Alternate to max function

try to use proc sort;
Such as:
[pre]
proc sort data=yourdataset;
by descending yourvar;
run;
[/pre]



Good Luck.

Message was edited by: Ksharp Message was edited by: Ksharp
Super User
Posts: 3,115

Re: Alternate to max function

While Linus is correct in that no one technique is necessarily more efficient than another, I stand by my comment that the MEANS procedure has a significant invocation overhead compared to SQL.

Therefore simple summary stats in SQL like for example:

proc sql;
create table sum as
mean(age),
max(height
from sashelp.class
group by sex;

can often be more efficient than the procedure. Please note this is only worth pursuing for large datasets where you have performance issues and you do have to benchmark your methods for ultimate proof.
Ask a Question
Discussion stats
  • 5 replies
  • 165 views
  • 0 likes
  • 5 in conversation