I would like to calculate a sum, and add a field through the use of hash object
and get the same result (Table Calculated) obtained from the following proc sql:
data input;
input k1 k2 num;
datalines;
1 1 10
1 1 20
1 1 30
2 1 25
2 2 40
2 3 80
run;
proc sql;
create table calculated as
select *,
sum(num) as sum_num,
case when k1=1 then .
when k1=2 then 1
end as new_field
from input
group by k1,k2;
quit;
In order to perform a match, hash needs clear keys. IMHO it can not deal with mnemonics operators like '<'
So you would have to calculate the sum_num and set the corresponding new_field first. Afterwards you can perform your merge with hash.
I would use @Kurt_Bremser's code to perform the first step, while defining the new_field value as requested and load the table into the hash for match merging. - of course only if I expect the time requested for programming to be compensated by run time 😉
DATA help (keep=k1 k2 sum_num new_field);
set input;
retain sum_num;
by k1 k2;
if first.k2 then sum_num = 0;
sum_num + num;
if last.k2 then do;
if k1 eq 1 then new_field=.;
else if k1 eq 2 and sum_num < 50 then new_field=1;
output;
end;
RUN;
DATA calculated2;
length new_field sum_num 8;
drop rc;
if _n_=1 then do;
call missing(new_field, sum_num);
declare hash NF(dataset:'help');
NF.definekey('k1', 'k2');
NF.definedata(all:'YES');
NF.definedone();
end;
set input;
rc=NF.find(key:k1,key:k2);
RUN;
- Cheers -
If you are looking for a non-SQL solution (maybe because of performance problems?), consider this:
data help (keep=k1 k2 sum_num);
set input;
retain sum_num;
by k1 k2;
if first.k2 then sum_num = 0;
sum_num + num;
if last.k2 then output;
run;
data calculated;
merge
input
help
;
by k1 k2;
if k1 = 2 then new_field = 1;
run;
As opposed to the SQL, this also preserves the original order.
%MACRO sum(k1,k2);
PROC SQL;
UPDATE calculated2
SET sum_num=(select sum(num) FROM calculated WHERE k1 eq &k1. AND k2 eq &k2. GROUP BY k1,k2)
WHERE k1 eq &k1. AND k2 eq &k2.
;
QUIT;
%MEND;
DATA calculated2;
length new_field sum_num 8;
drop rc;
if _n_=1 then do;
call missing(new_field);
declare hash NF();
NF.definekey('k1');
NF.definedata('new_field');
NF.definedone();
NF.add(key:1, data:.);
NF.add(key:2, data:1);
end;
set input;
call execute('%'||'sum('||put(k1,best.)||','||put(k2,best.)||');');
rc=NF.find();
RUN;
Nice problem, thanks for asking 🙂
Cheers,
Oligolas
- Cheers -
Thanks!
And if there was a more restrictive condition on the "case when"?
For example:
proc sql;
create table calculated as
select *,
sum(num) as sum_num,
case when k1=1 then .
when k1=2 and calculated sum_num < 50 then 1
end as new_field
from input
group by k1,k2;
quit;
In order to perform a match, hash needs clear keys. IMHO it can not deal with mnemonics operators like '<'
So you would have to calculate the sum_num and set the corresponding new_field first. Afterwards you can perform your merge with hash.
I would use @Kurt_Bremser's code to perform the first step, while defining the new_field value as requested and load the table into the hash for match merging. - of course only if I expect the time requested for programming to be compensated by run time 😉
DATA help (keep=k1 k2 sum_num new_field);
set input;
retain sum_num;
by k1 k2;
if first.k2 then sum_num = 0;
sum_num + num;
if last.k2 then do;
if k1 eq 1 then new_field=.;
else if k1 eq 2 and sum_num < 50 then new_field=1;
output;
end;
RUN;
DATA calculated2;
length new_field sum_num 8;
drop rc;
if _n_=1 then do;
call missing(new_field, sum_num);
declare hash NF(dataset:'help');
NF.definekey('k1', 'k2');
NF.definedata(all:'YES');
NF.definedone();
end;
set input;
rc=NF.find(key:k1,key:k2);
RUN;
- Cheers -
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.