Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Home
- /
- Programming
- /
- SAS Procedures
- /
- new variables through sum of other variables

Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Posted 10-21-2021 10:28 AM
(270 views)

Hello everyone, I have a monthly process where I get a table on which I need to perform a transformation that goes by adding the variable M1 and M2 of each branch. The problem is that from month to month I can have more or less branches, so my variables can vary, which makes it difficult for me to create a simple query to add variables having previously declared them. I need some procedure that knows it should sum variable starting with M1 with variable starting with M2, by branch name, also being prepared to changing number of branches (variables) from month to month.

Below I have two tables, which I hope will allow you to understand what my problem is. Additionally, I give you two other tables that will be the desired result.

August Input:

TYPE | M1_BRANCH1 | M1_BRANCH4 | M1_BRANCH9 | M2_BRANCH1 | M2_BRANCH4 | M2_BRANCH9 |

CASH | 10 | 20 | 15 | 15 | 20 | 30 |

August desired Output:

TYPE | BRANCH1 | BRANCH4 | M1_BRANCH9 |

CASH | 25 | 40 | 45 |

September input:

TYPE | M1_BRANCH1 | M1_BRANCH4 | M2_BRANCH1 | M2_BRANCH4 |

CASH | 10 | 20 | 15 | 20 |

September desired Output:

TYPE | M1_BRANCH1 | M1_BRANCH4 |

CASH | 25 | 40 |

Thank you so much for your help.

Corrected the desired output for september.

5 REPLIES 5

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

The data structure you have currently is making your task extremely difficult. Having data in the variable name, such as branch and month is the main issue an will likely cause many problems in the long run.

You would be better off to reshape your data for long term use in a form such as

Type Branchname ReportMonth Value(the numbers)

Where your Branchname variable would have values like Branch1 Branch2. The report month would be best for a lot of reasons to be an actual date, such as the first day of the month.

If "M1" "M2" represent a different category then that should also be a different variable hold the value or meaning of M1 M2

And just exactly what are you adding to get those values for September in your example? If we add everything in your example it only totals 75 but you show 80 as the "sum" for M1_Branch4. Example outputs should use example inputs.

Please provide realistic input data what you expect for output for the example. As text Or Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the </> icon or attached as text to show exactly what you have and that we can test code against.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Sorry, i've already corrected the desired output. M1_BRANCH1 should sum with M2_BRANCH1.

I understand your comment regarding the data structure not being adequate, however I am not the owner of the table, so I cannot change the structure of the original table.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

You can restructure it using PROC TRANSPOSE then, unless you think you're required to keep it in this structure for some reason? That way you'll be more easily able to sum by months or branch as needed.

Otherwise you're stuck writing formulas such as below where you need to list each number. You could use VNAME to extract the names and then conditionally add them together but that seems a bit painful and you'd at least need to know how many branches to sum.

Flipping the data makes it a much more easy and dynamic calculation. But ultimately your choice of what type of code you want to maintain.

branch1 = sum( of m1_branch1, m2_branch1, m3_branch1, m4_branch1);

Otherwise you're stuck writing formulas such as below where you need to list each number. You could use VNAME to extract the names and then conditionally add them together but that seems a bit painful and you'd at least need to know how many branches to sum.

Flipping the data makes it a much more easy and dynamic calculation. But ultimately your choice of what type of code you want to maintain.

branch1 = sum( of m1_branch1, m2_branch1, m3_branch1, m4_branch1);

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

@jmrc wrote:

Sorry, i've already corrected the desired output. M1_BRANCH1 should sum with M2_BRANCH1.

I understand your comment regarding the data structure not being adequate, however I am not the owner of the table, so I cannot change the structure of the original table.

That does not matter. You make a properly structured data set from the garbage someone sends you and them make a report from that. The intermediate steps are up to you.

I have had to work with lots of garbage file formats. As long as the output is as desired and matches the users need they generally don't care what goes on in the background as long as everything is repeatable.

And consider if you have nicely structured collection of data sets when the person providing this to you asks "what about a total over the period from Mar 2019 to May2021" or "I only want to see Branches X., Y and Z" for some report then you are golden and use the data you restructured to use the actual dates and branch values (NOTE: Your example has a terrible problem just waiting to happen with Years and order of "months" [if you are actually thinking of using "September"].)

SAS reporting or analysis procedures can do many things with values. Not so much with variable names containing data.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Each sas table observation (row) has the same number of columns (ID and max number of branches).

I suppose yourinput is probaly excel iported into sas. In such case all tyepes have the same number of barnches, either branch1, branch2 or barnch(n). Ii is possible that the value of a specific type,branch(x) is missing. Remember that the sum result of values, where one of them (or more) is missing - the result is also missing value.

Suppose there are maximum N branches in the input, that is the variables are: M1_branch1, M1_branch2, ..., M1_cranchN and M2_branch1, M2_branch2, ..., M2_cranchN. In such case the simplest code will be:

```
%let N=4; /* maxno of barnches */
data want;
set have;
array m1 {*} M1_branch1 - M1_branch&N;
array m2 {*} M2_branch1 - M2_branch&N;
do i=1 to &N;
m1(i) = m1(i) + m2(i);
end;
keep type M1_branch1 - M1_branch&N;
run;
```

Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.

**If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website. **

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.