I would like to calculate a sum, and add a field through the use of hash object
and get the same result (Table Calculated) obtained from the following proc sql:
data input;
input k1 k2 num;
datalines;
1 1 10
1 1 20
1 1 30
2 1 25
2 2 40
2 3 80
run;
proc sql;
create table calculated as
select *,
sum(num) as sum_num,
case when k1=1 then .
when k1=2 then 1
end as new_field
from input
group by k1,k2;
quit;
In order to perform a match, hash needs clear keys. IMHO it can not deal with mnemonics operators like '<'
So you would have to calculate the sum_num and set the corresponding new_field first. Afterwards you can perform your merge with hash.
I would use @Kurt_Bremser's code to perform the first step, while defining the new_field value as requested and load the table into the hash for match merging. - of course only if I expect the time requested for programming to be compensated by run time 😉
DATA help (keep=k1 k2 sum_num new_field);
set input;
retain sum_num;
by k1 k2;
if first.k2 then sum_num = 0;
sum_num + num;
if last.k2 then do;
if k1 eq 1 then new_field=.;
else if k1 eq 2 and sum_num < 50 then new_field=1;
output;
end;
RUN;
DATA calculated2;
length new_field sum_num 8;
drop rc;
if _n_=1 then do;
call missing(new_field, sum_num);
declare hash NF(dataset:'help');
NF.definekey('k1', 'k2');
NF.definedata(all:'YES');
NF.definedone();
end;
set input;
rc=NF.find(key:k1,key:k2);
RUN;
- Cheers -
If you are looking for a non-SQL solution (maybe because of performance problems?), consider this:
data help (keep=k1 k2 sum_num);
set input;
retain sum_num;
by k1 k2;
if first.k2 then sum_num = 0;
sum_num + num;
if last.k2 then output;
run;
data calculated;
merge
input
help
;
by k1 k2;
if k1 = 2 then new_field = 1;
run;
As opposed to the SQL, this also preserves the original order.
%MACRO sum(k1,k2);
PROC SQL;
UPDATE calculated2
SET sum_num=(select sum(num) FROM calculated WHERE k1 eq &k1. AND k2 eq &k2. GROUP BY k1,k2)
WHERE k1 eq &k1. AND k2 eq &k2.
;
QUIT;
%MEND;
DATA calculated2;
length new_field sum_num 8;
drop rc;
if _n_=1 then do;
call missing(new_field);
declare hash NF();
NF.definekey('k1');
NF.definedata('new_field');
NF.definedone();
NF.add(key:1, data:.);
NF.add(key:2, data:1);
end;
set input;
call execute('%'||'sum('||put(k1,best.)||','||put(k2,best.)||');');
rc=NF.find();
RUN;
Nice problem, thanks for asking 🙂
Cheers,
Oligolas
- Cheers -
Thanks!
And if there was a more restrictive condition on the "case when"?
For example:
proc sql;
create table calculated as
select *,
sum(num) as sum_num,
case when k1=1 then .
when k1=2 and calculated sum_num < 50 then 1
end as new_field
from input
group by k1,k2;
quit;
In order to perform a match, hash needs clear keys. IMHO it can not deal with mnemonics operators like '<'
So you would have to calculate the sum_num and set the corresponding new_field first. Afterwards you can perform your merge with hash.
I would use @Kurt_Bremser's code to perform the first step, while defining the new_field value as requested and load the table into the hash for match merging. - of course only if I expect the time requested for programming to be compensated by run time 😉
DATA help (keep=k1 k2 sum_num new_field);
set input;
retain sum_num;
by k1 k2;
if first.k2 then sum_num = 0;
sum_num + num;
if last.k2 then do;
if k1 eq 1 then new_field=.;
else if k1 eq 2 and sum_num < 50 then new_field=1;
output;
end;
RUN;
DATA calculated2;
length new_field sum_num 8;
drop rc;
if _n_=1 then do;
call missing(new_field, sum_num);
declare hash NF(dataset:'help');
NF.definekey('k1', 'k2');
NF.definedata(all:'YES');
NF.definedone();
end;
set input;
rc=NF.find(key:k1,key:k2);
RUN;
- Cheers -
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.