@whs278: What @Reeza said.
However, the answer to the question whether it can be mimicked using the DATA step, the answer is "yes", provided that one is willing to (or can) do a modicum of procedural programming (as opposed to logical set programming done via SQL). Using the hash object, it is also possible to construct a step that does the job without relying on a sorted key order - just like SQL doesn't, either.
However, programming provisions must be made to account for the missing values of analytical variables (in your case, salary) properly, in order to prevent the averages from being corrupted. Note that below, I've added a few extra records with missing values of salary to the sample data to test the null handling. It is done by skipping missing salary records on the aggregation (i.e. first) pass through the data using the WHERE clause. On the second pass, when the accumulated stats are merged back to the original data, this clause is skipped.
data have ;
input jobtype $ salary :dollar. ;
cards ;
Farmer $50,000
Farmer $10,000
Banker $100,000
Banker $300,000
Idler .
Idler .
Clerk $40,000
Clerk .
;
run ;
data want (drop = sum N) ;
if _n_ = 1 then do ;
dcl hash h () ;
h.definekey ("jobtype") ;
h.definedata ("maxsalary", "sum", "N") ;
h.definedone () ;
do until (z) ;
set have (where=(N(salary))) end = z ;
if h.find() ne 0 then call missing (maxsalary, sum, N) ;
sum = sum (sum, salary) ;
N = sum (N, N (salary)) ;
maxsalary = max (maxsalary, salary) ;
h.replace() ;
end ;
end ;
set have ;
_iorc_ = h.find() ;
avgsalary = divide (sum, N) ;
run ;
Note the advantage of using the DIVIDE function compared to the division operator: If the divisor is missing or zero, the function doesn't generate errors and/or warnings but simply an appropriate missing value. That is, if the divisor is 0, it generates the special missing value .I (standing for +infinity) if the numerator is positive and .M (standing for -infinity) if it is negative; if the divisor is a standard missing value (aka "dot-blank"), a standard missing value is generated. (There're other variants listed in the documentation.)
Kind regards
Paul D.
... View more