I have a dataset below. The formula is just summing each value in the MemberCount column. I thought about just creating a new variable for each value but is there an easier way to do this? Thanks!
IndicatorKey | MemberCount | Formula | Calc |
201820_20 | 323 | (201820_20:MemberCount+201822_20:MemberCount+201824_20:MemberCount+201828_20:MemberCount) | 1800 |
201822_20 | 560 | (201820_20:MemberCount+201822_20:MemberCount+201824_20:MemberCount+201828_20:MemberCount) | 1800 |
201824_20 | 257 | (201820_20:MemberCount+201822_20:MemberCount+201824_20:MemberCount+201828_20:MemberCount) | 1800 |
201828_20 | 660 | (201820_20:MemberCount+201822_20:MemberCount+201824_20:MemberCount+201828_20:MemberCount) | 1800 |
data have;
infile cards expandtabs truncover;
input IndicatorKey :$20. MemberCount Formula $100.;
cards;
201820_20 323 (201820_20:MemberCount+201822_20:MemberCount+201824_20:MemberCount+201828_20:MemberCount)
201822_20 560 (201820_20:MemberCount+201822_20:MemberCount+201824_20:MemberCount+201828_20:MemberCount)
201824_20 257 (201820_20:MemberCount+201822_20:MemberCount+201824_20:MemberCount+201828_20:MemberCount)
201828_20 660 (201820_20:MemberCount+201822_20:MemberCount+201824_20:MemberCount+201828_20:MemberCount)
;
data want;
if _n_=1 then do;
if 0 then set have(rename=(MemberCount=_MemberCount));
declare hash h(dataset:'have(rename=(MemberCount=_MemberCount))');
h.definekey('IndicatorKey');
h.definedata('_MemberCount');
h.definedone();
end;
set have;
want=0;
do i=1 to countw(Formula,'_','kd');
if h.find(key:scan(Formula,i,'_','kd'))=0 then want+_MemberCount;
end;
drop i _MemberCount;
run;
Your formulas all appear to be identical, so I see no reason to parse the formula row by row instead of making a new variable with the single, repeated, formula in your program code.
But also, I do not understand your formula, which seems to be 5 colon-separated expressions. How would they be interpreted?
This is just one formula I have in a large dataset where the formula variable includes like 50 different formulas. I just made this table to simplify my question. I have SAS correctly referencing all of the other formulas using call execute but this one is giving me trouble. I want SAS to reference the correct value. "201820_20:MemberCount" is referencing the value of MemberCount for the IndicatorKey=201820_20 which would be 323 (the first row) and so on and so forth. So the final column Calc should just add up each row to 1800 (323+560+257+660).
If you want to aggregate by the value of variable like INDICATORKEY then just tell SAS to do that directly. No need for a "formula".
proc means ;
class indicatorkey;
var membercount;
run;
If the example you provide is representative, then you want the sum of calc over all obs that have the same formula. And for each distinct formula, the summation includes all the corresponding calc values and nothing else.
And if the data are already physically grouped by formula, then this would work:
data want;
do until (last.formula);
set have;
by formula notsorted;
calc=sum(calc,membercount);
end;
do until (last.formula);
set have;
by formula notsorted;
output;
end;
run;
Now this does create a new variable, but it's a trivial effort.
data have;
infile cards expandtabs truncover;
input IndicatorKey :$20. MemberCount Formula $100.;
cards;
201820_20 323 (201820_20:MemberCount+201822_20:MemberCount+201824_20:MemberCount+201828_20:MemberCount)
201822_20 560 (201820_20:MemberCount+201822_20:MemberCount+201824_20:MemberCount+201828_20:MemberCount)
201824_20 257 (201820_20:MemberCount+201822_20:MemberCount+201824_20:MemberCount+201828_20:MemberCount)
201828_20 660 (201820_20:MemberCount+201822_20:MemberCount+201824_20:MemberCount+201828_20:MemberCount)
;
data want;
if _n_=1 then do;
if 0 then set have(rename=(MemberCount=_MemberCount));
declare hash h(dataset:'have(rename=(MemberCount=_MemberCount))');
h.definekey('IndicatorKey');
h.definedata('_MemberCount');
h.definedone();
end;
set have;
want=0;
do i=1 to countw(Formula,'_','kd');
if h.find(key:scan(Formula,i,'_','kd'))=0 then want+_MemberCount;
end;
drop i _MemberCount;
run;
Yikes. What a complicated process for such a simple task.
proc sql ;
create table want2 as
select *
, sum(membercount) as want
from have
group by formula
order by 1
;
quit;
Here is result of comparing WANT to WANT2.
Observation Summary Observation Base Compare First Obs 1 1 Last Obs 4 4 Number of Observations in Common: 4. Total Number of Observations Read from WORK.WANT: 4. Total Number of Observations Read from WORK.WANT2: 4. Number of Observations with Some Compared Variables Unequal: 0. Number of Observations with All Compared Variables Equal: 4. NOTE: No unequal values were found. All values compared are exactly equal.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.