Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Home
- /
- Programming
- /
- Programming
- /
- Using formulas to sum values between rows

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

☑ This topic is **solved**.
Need further help from the community? Please
sign in and ask a **new** question.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Posted 05-01-2023 06:18 PM
(285 views)

I have a dataset below. The formula is just summing each value in the MemberCount column. I thought about just creating a new variable for each value but is there an easier way to do this? Thanks!

IndicatorKey | MemberCount | Formula | Calc |

201820_20 | 323 | (201820_20:MemberCount+201822_20:MemberCount+201824_20:MemberCount+201828_20:MemberCount) | 1800 |

201822_20 | 560 | (201820_20:MemberCount+201822_20:MemberCount+201824_20:MemberCount+201828_20:MemberCount) | 1800 |

201824_20 | 257 | (201820_20:MemberCount+201822_20:MemberCount+201824_20:MemberCount+201828_20:MemberCount) | 1800 |

201828_20 | 660 | (201820_20:MemberCount+201822_20:MemberCount+201824_20:MemberCount+201828_20:MemberCount) | 1800 |

1 ACCEPTED SOLUTION

Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

```
data have;
infile cards expandtabs truncover;
input IndicatorKey :$20. MemberCount Formula $100.;
cards;
201820_20 323 (201820_20:MemberCount+201822_20:MemberCount+201824_20:MemberCount+201828_20:MemberCount)
201822_20 560 (201820_20:MemberCount+201822_20:MemberCount+201824_20:MemberCount+201828_20:MemberCount)
201824_20 257 (201820_20:MemberCount+201822_20:MemberCount+201824_20:MemberCount+201828_20:MemberCount)
201828_20 660 (201820_20:MemberCount+201822_20:MemberCount+201824_20:MemberCount+201828_20:MemberCount)
;
data want;
if _n_=1 then do;
if 0 then set have(rename=(MemberCount=_MemberCount));
declare hash h(dataset:'have(rename=(MemberCount=_MemberCount))');
h.definekey('IndicatorKey');
h.definedata('_MemberCount');
h.definedone();
end;
set have;
want=0;
do i=1 to countw(Formula,'_','kd');
if h.find(key:scan(Formula,i,'_','kd'))=0 then want+_MemberCount;
end;
drop i _MemberCount;
run;
```

8 REPLIES 8

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Your formulas all appear to be identical, so I see no reason to parse the formula row by row instead of making a new variable with the single, repeated, formula in your program code.

But also, I do not understand your formula, which seems to be 5 colon-separated expressions. How would they be interpreted?

--------------------------

The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for

Allow PROC SORT to output multiple datasets

--------------------------

The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for

Allow PROC SORT to output multiple datasets

--------------------------

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

If you want to aggregate by the value of variable like INDICATORKEY then just tell SAS to do that directly. No need for a "formula".

```
proc means ;
class indicatorkey;
var membercount;
run;
```

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

If the example you provide is representative, then you want the sum of calc over all obs that have the same formula. And for each distinct formula, the summation includes all the corresponding calc values and nothing else.

And if the data are already physically grouped by formula, then this would work:

```
data want;
do until (last.formula);
set have;
by formula notsorted;
calc=sum(calc,membercount);
end;
do until (last.formula);
set have;
by formula notsorted;
output;
end;
run;
```

Now this does create a new variable, but it's a trivial effort.

--------------------------

The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for

Allow PROC SORT to output multiple datasets

--------------------------

The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for

Allow PROC SORT to output multiple datasets

--------------------------

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

```
data have;
infile cards expandtabs truncover;
input IndicatorKey :$20. MemberCount Formula $100.;
cards;
201820_20 323 (201820_20:MemberCount+201822_20:MemberCount+201824_20:MemberCount+201828_20:MemberCount)
201822_20 560 (201820_20:MemberCount+201822_20:MemberCount+201824_20:MemberCount+201828_20:MemberCount)
201824_20 257 (201820_20:MemberCount+201822_20:MemberCount+201824_20:MemberCount+201828_20:MemberCount)
201828_20 660 (201820_20:MemberCount+201822_20:MemberCount+201824_20:MemberCount+201828_20:MemberCount)
;
data want;
if _n_=1 then do;
if 0 then set have(rename=(MemberCount=_MemberCount));
declare hash h(dataset:'have(rename=(MemberCount=_MemberCount))');
h.definekey('IndicatorKey');
h.definedata('_MemberCount');
h.definedone();
end;
set have;
want=0;
do i=1 to countw(Formula,'_','kd');
if h.find(key:scan(Formula,i,'_','kd'))=0 then want+_MemberCount;
end;
drop i _MemberCount;
run;
```

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Thank you! This worked!

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Yikes. What a complicated process for such a simple task.

```
proc sql ;
create table want2 as
select *
, sum(membercount) as want
from have
group by formula
order by 1
;
quit;
```

Here is result of comparing WANT to WANT2.

Observation Summary Observation Base Compare First Obs 1 1 Last Obs 4 4 Number of Observations in Common: 4. Total Number of Observations Read from WORK.WANT: 4. Total Number of Observations Read from WORK.WANT2: 4. Number of Observations with Some Compared Variables Unequal: 0. Number of Observations with All Compared Variables Equal: 4. NOTE: No unequal values were found. All values compared are exactly equal.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Tom,

I think formula is changing dynamically , not just for all the obs. Maybe OP should explain more details about his question?

I think formula is changing dynamically , not just for all the obs. Maybe OP should explain more details about his question?

Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.

**If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website. **

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.