- Creating Sum Variables For All Combinations Of 9 V...

09-21-2017 09:25 PM

I am new to the SAS community and appreciate any help you can provide.

I have 9 variables - H1 ... H9. I need to create new variables with the sums of each possible combination of those original 9.

Does anyone have code to complete that task? Thanks!

Posted in reply to whajjar71

09-22-2017 01:18 AM

I take it you want all 2-element sums, 3-element sums, ..., and the 9-element sum, right?

You could do some nested loops in a data step, but I'd suggest using PROC SUMMARY to generate a data set with all the combinations (from 1-way "combination" to 9-way). Then read that data set and calculate hsum in a single assignment:

```
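/* Sample data: two rows, each identified by ID */
data have;
  id=1;
  array h {9} (1,2,4,8,16,32,64,128,256);
  output;
  id=2;
  do i=1 to 9; h{i}=2*h{i}; end;
  output;
run;

/* One output row per non-empty subset of H1-H9 within each ID.   */
/* Class variables outside the subset are set to missing.         */
/* BY ID assumes HAVE is sorted by ID.                             */
proc summary data=have (keep=id h1-h9) completetypes noprint missing chartype;
  by id;
  class h1-h9;
  output out=need / ways;
  ways 1 to 9;
run;

/* SUM() ignores missing values, so HSUM is the sum of the subset */
data want;
  set need (drop=_freq_);
  hsum=sum(of h1-h9);
run;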
```

Dataset NEED will have, for each ID, 511 observations (=2**9 - 1), one for each possible combination of H values. It will also have the variables _WAY_ (1 for a 1-way combo, 2 for a 2-way combo, etc.) and _TYPE_. _TYPE_ will be a 9-character string of 1's and 0's indicating which H variables are present and which are missing.

In this particular example HSUM will have every integer value from 1 to 511 for ID=1. For ID=2, hsum will have every even value from 2 to 1022.
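A quick way to check those counts is sketched below. It assumes the WANT data set created by the code above; the column aliases (n_combos, n_distinct, and so on) are only illustrative names.

```sas
/* Summarize the combinations generated for each ID in WANT */
proc sql;
  select id,
         count(*)             as n_combos,    /* rows per ID            */
         count(distinct hsum) as n_distinct,  /* distinct sums per ID   */
         min(hsum)            as min_hsum,
         max(hsum)            as max_hsum
  from want
  group by id;
quit;
```

If the description above holds, each ID should show 511 rows, with sums running from 1 to 511 for ID=1 and from 2 to 1022 for ID=2.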

Posted in reply to mkeintz

09-22-2017 09:27 AM

Thanks for the response. One clarification: I need to execute these summations across all 10,000 observations.

Does having more than one row change the code or process?

Posted in reply to whajjar71

09-22-2017 10:50 AM

If you take a close look at my example, you'll see that it treats 2 rows, not just one.

But the requirement of this approach is that you need some identifier variable(s) to uniquely identify each row.
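If the real data has no such identifier, one could be added from the row number. This is only a sketch: it assumes the input data set is called HAVE and does not already contain a variable named ID, and HAVE_ID is a hypothetical output name.

```sas
/* Add a unique row identifier (assumes HAVE has no ID variable yet) */
data have_id;
  set have;
  id = _n_;   /* observation number serves as a unique key */
run;
```

Because the ID is assigned in row order, HAVE_ID is already sorted by ID, which the BY statement in PROC SUMMARY requires.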

Posted in reply to whajjar71

09-23-2017 08:29 AM

That will produce 2^9 = 512 observations for each input observation (over 5 million rows for 10,000 observations). Are you sure you want this?

Posted in reply to whajjar71

09-23-2017 09:00 AM

```
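data have;
  infile cards dlm=',';
  input h1-h9;
  cards;
1,2,4,8,16,32,64,128,256
;
run;

data want;
  set have;
  array h{*} h1-h9;
  array x{*} x1-x9;   /* 0/1 indicators maintained by GRAYCODE */
  k=-1;               /* negative K: first call starts from the empty subset */
  do i=1 to 2**dim(x);
    rc=graycode(k,of x{*});
    /* recompute the sum of the selected H values from scratch */
    sum=0;
    do j=1 to dim(x);
      sum+x{j}*h{j};
    end;
    output;
  end;
  keep h1-h9 sum k;
run;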
```

Posted in reply to Ksharp

09-23-2017 07:08 PM

I hadn't heard of graycode before, so I looked it up on Wikipedia. I like the idea of using graycode to step through the combinations, but it would be nicer to avoid looping through all 9 products (h{i}*x{i}) to generate a sum for each graycode iteration.

And it seems that ought to be possible. SAS defines graycode as "generate all combinations of n items *in minimal change order*", and Wikipedia says graycode's intrinsic property is to change only one member of the combination at a time, i.e. each step either adds one element to the prior combination or removes one.

That suggests that, to take full advantage of graycode, one could update the sum iteratively (by adding or subtracting a single H value) instead of calculating it from scratch. What I don't immediately see is how best to identify the added or removed element, but doing so could improve efficiency a lot, especially for large datasets and long arrays.
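The one-change-per-step property can be seen by tracing the GRAYCODE function on a small case. The sketch below uses three indicator variables (x1-x3, names chosen here only for illustration) and writes each subset to the log.

```sas
/* Trace GRAYCODE over all subsets of 3 items */
data _null_;
  array x{3} x1-x3;
  k=-1;                       /* negative K: start from the empty subset */
  do i=1 to 2**dim(x);
    rc=graycode(k, of x{*});  /* advance to the next subset */
    put i= k= x1= x2= x3=;    /* K is the number of items in the subset */
  end;
run;
```

Each log line should differ from the previous one in exactly one of x1-x3, and K should go up or down by one accordingly.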

Posted in reply to mkeintz

09-25-2017 08:49 AM

@mkeintz I agree with you. That would lead to using SAS/IML, and I doubt the OP has the IML product.

Posted in reply to Ksharp

09-25-2017 04:44 PM

Here's a relatively simple way to do the task in a data step:

```
data have;
  input id h1-h9;
datalines;
1 256 128 64 32 15 8 4 2 1
2 512 256 128 64 32 16 8 4 2
3 1 2 4 8 16 32 64 128 256
4 2 4 8 16 32 64 128 256 512
;
run;

%let dim=9;
%let ncombo=%eval(2**&dim);

/* The %grcode_setup macro must be compiled before this step runs */
data want (keep=id i h:);
  if _n_=1 then do;
    /* build the element/sign lookup arrays once */
    %grcode_setup(size=&dim);
  end;
  set have;
  array h{&dim};
  hcount=0;   /* number of elements in the current combination */
  hsum=0;     /* running sum of the current combination */
  do i=1 to &ncombo;
    hcount = hcount + _graycode_sign{i};
    hsum   = hsum + h{_graycode_element{i}}*_graycode_sign{i};
    output;
  end;
run;
```

The IF "_n_=1" block calls a macro that iterates via a graycode progression through all combinations of &DIM items. But instead of revising an array of dummies (as the SAS graycode function does), it uses the underlying graycode algorithm to build two other arrays that record which element is added to or removed from the combination at each step:

(1) _graycode_element{i}, the element to add or remove at the i'th iteration

(2) _graycode_sign{i}, -1 (remove) or +1 (add) for the i'th iteration (0 at i=1, which leaves the combination empty)

Use these arrays later to identify the element to add or remove while maintaining a running HSUM.

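Also, if you actually do want to maintain an array of dummies, as in the graycode function, just add

array dum{&dim}; do _j=1 to &dim; dum{_j}=0; end;

prior to the do loop. The reset has to run for every observation: the last combination in the sequence still contains element &dim, so initial values on the ARRAY statement (which are retained) would carry that state into the next row. Inside the do loop add:

dum{_graycode_element{i}} = dum{_graycode_element{i}} + _graycode_sign{i};

and add dum: to the KEEP= list.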

Here's the macro being called:

```
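%macro grcode_setup(size=);
  %local size nc;
  %let nc= %eval(2**&size);
  /* Defaults (element 1, sign 0) leave iteration 1 as the empty combination */
  array _graycode_element{&nc} _temporary_ (&nc*1);
  array _graycode_sign{&nc} _temporary_ (&nc*0);
  do _digit=1 to &size;
    _d=_digit;
    /* element _digit changes at iteration 2**(_digit-1)+1, then every 2**_digit */
    do _i = 2**(_digit-1)+1 to &nc by 2**_digit;
      _graycode_element{_i}=_digit;
      _graycode_sign{_i}=sign(_d);   /* +1 = add, -1 = remove */
      _d=-1*_d;                      /* alternate add and remove */
    end;
  end;
  drop _digit _d _i;
%mend grcode_setup;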
```

Notice the macro does not fill the arrays sequentially from I=1 to &NC. Instead it works one element at a time: first it marks all the iterations in which element 1 changes, then those in which element 2 changes, and so on. For element 1, the sequence of dummy values is 0110 (repeating), meaning it changes every 2nd iteration, starting with iteration 2. For element 2, the sequence is 00111100 (changing every 4th iteration, starting with iteration 3). In general, element d changes every 2**d iterations, starting with iteration 2**(d-1)+1.
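To make that pattern concrete, here is a small stand-alone sketch that repeats the macro's loop logic for 3 elements (8 iterations) and prints the two lookup arrays. The names elem, sgn, e and s are illustrative and are not part of the macro.

```sas
/* Rebuild the element/sign tables for 3 elements and list them */
data _null_;
  array elem{8} _temporary_ (8*1);   /* element changed at each iteration */
  array sgn{8}  _temporary_ (8*0);   /* +1 = add, -1 = remove, 0 = none   */
  do digit=1 to 3;
    d=1;
    do i=2**(digit-1)+1 to 8 by 2**digit;
      elem{i}=digit;
      sgn{i}=d;
      d=-d;                          /* alternate add and remove */
    end;
  end;
  do i=1 to 8;
    e=elem{i};
    s=sgn{i};
    put i= e= s=;
  end;
run;
```

The log should list elements 1, 1, 2, 1, 3, 1, 2, 1 with signs 0, +1, +1, -1, +1, +1, -1, -1, which corresponds to the combinations {} → {1} → {1,2} → {2} → {2,3} → {1,2,3} → {1,3} → {3}.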

Posted in reply to mkeintz

09-27-2017 05:12 AM

"What I don't immediately see is how to best identify the added or removed element, but one could improve efficiency a lot."

RC marks the spot:

```
data have;
  infile cards dlm=',';
  input h1-h9;
  cards;
1,2,4,8,16,32,64,128,256
;
run;

data want2;
  set have;
  array h{*} h1-h9;
  array x{*} x1-x9;
  k=-1;
  sum=0;
  rc=graycode(k,of x{*});
  output;
  do i=1 to 2**dim(x)-1;
    rc=graycode(k,of x{*});
    if x{rc} then sum=sum+h{rc};
    else sum=sum-h{rc};
    output;
  end;
  keep h1-h9 sum k;
run;
```

Posted in reply to whajjar71

09-25-2017 03:52 PM

How about taking a single observation with 4 variables instead of 9 and generating the desired output by hand, so we can see what you want in a more concrete example?

I'm not sure that some of the comments about the number of variables involved are sinking in; doing this by hand for a smaller set may demonstrate the concerns raised.