Just to add: the probably fastest method might be this:
proc summary data=masterfile;
var price;
class categoryid;
output out=sum1 (drop=_type_ _freq_ rename=(price=categoryrevenue)) sum=;
run;
data sum2;
set sum1;
retain totalrevenue;
if categoryid = .
then do;
totalrevenue = categoryrevenue;
delete;
end;
run;
As this really does only one pass through the original dataset.
With 10 times as many observations as used by @art297, these steps outperformed the SQL 8:16 seconds to 44.53 seconds on my 2-CPU IBM p520.
... View more