Hi,
Im trying to Create all possible combinations for 66C8.
%let n=66;
%let k=8;
data Comb1 (keep=c1-c&k);
array c[&k] (&k.*0); /* initialize array to 0 */
ncomb = comb(&n, &k); /* number of combinations */
do j = 1 to ncomb;
rc = lexcombi(&n, &k, of c[*]);
end;
run;
Unfortunately my hardward is unable to do this much. i also dont have proc IML.
can any one suggets a quick work around?
That's 7.4 billion (7,392,009,768) records...
If you want a dataset with all combinations no matter what you do that's a frickin big data set. The optimization would probably be on what you plan to do with it next, as is creating a big dataset is creating a big dataset.
You can use the CALL ALLCOMB routine, if you really want to do it.
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a003112302.htm
The example is pretty straightforward. You could write it to a text file for use later, if that helps with the space issues?
abit less than that but yes the problem is its too big,
the previous code i used did use lexicomp which i think is a bit similar but i got I/O Issues.
each number from 1 to 66 is assigned a value and we want to see sum of the 8 values modulo 8 then see frequency of the remainder
Hi @Tj_chua,
With your data step you'll end up with only one observation: c1=59, c2=60, ..., c8=66. This is manageable in terms of size. On my workstation the step took 90 seconds. So, even run time should not be an issue. What kind of I/O issues did you experience?
Just as an example, I computed the frequencies of the remainders modulo 8 of sum(of c[*]), which took less than five minutes. (Indeed, comb(66,8) is only about 5.7E9.) Also, it should be possible to exploit, e.g., the fact that
mod(a+b, 8) = mod(mod(a, 8)+mod(b, 8), 8)
for all integers a, b.
With this kind of mathematical reasoning you might need less brute force computing.
Even in the worst case scenario where nothing else seems to work, you can program this yourself:
do i1=1 to 59;
do i2=i1+1 to 60;
do i3=i2+1 to 61;
do i4=i3+1 to 62;
do i5=i4+1 to 63;
do i6=i5+1 to 64;
do i7=i6+1 to 65;
do i8=i7+1 to 66;
* sum up c{i1} + c{i2} + ... + c{i8};
end; end; end; end; end; end; end; end;
It's clumsier, but if you're careful there won't be any accidents.
Hi @Tj_chua,
I've just written a draft program which, according to preliminary checks, can solve your problem in 0.1 seconds (as opposed to about 4:45 minutes for the brute force approach).
You mentioned 66 "values", which I assumed to be integers. The input for the below program is just the frequency distribution of the remainders modulo 8 of these 66 values. (Obviously, program code could be added to calculate this quickly.) The frequencies of remainders 0, 1, ..., 7 are the initial values of temporary array m.
As an example, I took the values 1, 2, ..., 66. Eight of them have remainder 0, nine have remainder 1 etc.
data comb0;
array m[0:7] _temporary_ (8 9 9 8 8 8 8 8);
array n[0:7] _temporary_;
array k[8];
do k1=0 to 7;
do k2=k1 to 7;
do k3=k2 to 7;
do k4=k3 to 7;
do k5=k4 to 7;
do k6=k5 to 7;
do k7=k6 to 7;
do k8=k7 to 7;
r=mod(sum(of k[*]),8);
f=1;
do i=0 to 7;
n[i]=0;
do j=1 to 8;
n[i]+(k[j]=i);
end;
if m[i]>=n[i] then f=f*comb(m[i],n[i]);
else do;
f=0;
leave;
end;
end;
if f then output;
end;
end;
end;
end;
end;
end;
end;
end;
drop i j k:;
run;
proc summary data=comb0 nway;
class r;
var f;
output out=want(drop=_:) sum=;
run;
proc print data=want noobs;
sum f;
run;
The trick is to partition the set of comb(66, 8)=5.7 billion combinations into comb(8+8-1, 8)=6435 subsets in which the target function r is constant and whose cardinalities f can be calculated by means of elementary combinatorics.
More explanations will follow tomorrow (CEST).
this looks good but im not that familiar to arrays,
would you be able to help me to do the following.
so there are 66 possible items
1-8 is assiged with value of 1
13-16 is assiged with value of 2
21-24 is assiged with value of 3
25-32 is assiged with value of 4
33-40 is assiged with value of 5
41-48 is assiged with value of 6
49-56 is assiged with value of 7
57-64 is assiged with value of 8
65&66 is assiged with value of 0.3
we want to take 8 a at a time and see the frequency of remainder.
The catch is, no combination of items can be repeated. (combination not permutation, meaning each item can be drawn only once in each combination.)
would it be easy to tweak the codes below for this purpose?
Thank you so much!
The 0.3 will require special consideration. What values are assigned to items 9-12 and 17-20?
We are talking about 8-element subsets of a 66-element set. So, your requirement that no item should be repeated within a subset is satisfied.
@FreelanceReinh ops i missed a few there
1-8 is assiged with value of 1
9-16 is assiged with value of 2
17-24 is assiged with value of 3
25-32 is assiged with value of 4
33-40 is assiged with value of 5
41-48 is assiged with value of 6
49-56 is assiged with value of 7
57-64 is assiged with value of 8
65&66 is assiged with value of 0.3
Thanks!
No problem. I already guessed the missing values.
Meanwhile I have adapted my program to the new information:
data comb0;
array m[0:8] _temporary_ (8 8 8 8 8 8 8 8 2);
array n[0:8] _temporary_;
array l[8] _temporary_;
array k[8];
do k1=0 to 8;
do k2=k1 to 8;
do k3=k2 to 8;
do k4=k3 to 8;
do k5=k4 to 8;
do k6=k5 to 8;
do k7=k6 to 8;
do k8=k7 to 8;
do _n_=1 to 8;
l[_n_]=ifn(k[_n_]=8, 0.3, k[_n_]);
end;
r=mod(sum(of l[*]),8);
f=1;
do i=0 to 8;
n[i]=0;
do j=1 to 8;
n[i]+(k[j]=i);
end;
if m[i]>=n[i] then f=f*comb(m[i],n[i]);
else do;
f=0;
leave;
end;
end;
if f then output;
end;
end;
end;
end;
end;
end;
end;
end;
drop i j k:;
run;
proc summary data=comb0 nway;
class r;
var f;
output out=want(drop=_:) sum=;
run;
proc print data=want noobs;
sum f;
run;
Total run time is still <0.1 s, although the number of observations in dataset COMB0 has increased from 6435 to 11583. But this is still much better than 5.7 billion.
I am going to add explanations to this post within the next few hours. Just wanted to deliver the code first. In parallel I run the program using the brute force approach, which I have also amended to match the new specifications. Estimated run time: 30 min.
Edit: The explanations have become a bit lengthy. I'll put them into a separate post to avoid technical issues.
Explanations
Hi @FreelanceReinh,
I think i explain something wrong, we want to form groups of 8s with no remainders, so( 6 6 6 6 6 6 6 6), there are no eights that can be formed. while if we get (2 6 6 6 6 6 6 6) 1 eight can be formed.
This is greate! is it also possible to get below
1)let x - how many 8 was formed,
2)let y - how many items formed the remainder/ Was not used? istead of the remainder it self, preffer lest amount of item possible.
3)let z - how many 0.3 in the combination.
Example:
Combination x Y Z
4 4 4 2 2 1 3 .3 => 2 eights formed, 3 Items remained. remainder = 1. count for .3 items is 1
.3 .3 3 5 1 2 5 8=> 3 eights formed, 2 Items remained. remainder = .6. count for .3 items is 2
4 4 3 5 1 2 5 8=> 4 eights formed, 0 Items remained. remainder = 0. count for .3 items is 0
4 4 4 4 1 2 1 1 => 4 eights formed, 2 Items remained. remainder = 5. count for .3 items is 0
X Y ZFreq
1 0 0 ???
1 1 1 ???
1 2 2 ???
.
.
Is this possible with the array?
Thank you so much, appriciate the help.
ill try to learn the explanation code.
Hi @Tj_chua,
It seems that your specifications change faster than I can implement them :-).
Here is the implementation of the specs of 2016-05-17 08:54 AM UTC, which were:
@Tj_chua wrote:
Hi @FreelanceReinh,
This is greate! is it also possible to get distribution of bellow
1)let x - how many 8 was formed, i think round(sum/8) will do.
2)let y - how many items formed the remainder? istead of the remainder it self, preffer lest amount of item possible.
3)let z - how many 0.3 in remainder.
Example:
Combination
4 4 4 2 2 1 3 .3 => 2 eights formed, 3 Items remained. remainder = 1. count for .3 items is 1
.3 .3 3 5 1 2 5 8=> 3 eights formed, 2 Items remained. remainder = .6. count for .3 items is 2
4 4 3 5 1 2 5 8=> 4 eights formed, 0 Items remained. remainder = 0. count for .3 items is 0
4 4 4 4 1 2 1 1 => 4 eights formed, 2 Items remained. remainder = 5. count for .3 items is 0
X Y ZFreq
1 0 0 ???
1 1 1 ???
1 2 2 ???
.
.
Thank you so much, appriciate the help.ill try to learn the explanation code.
%macro ymac;
%local i y;
%do y=2 %to 7;
t[1]=0;
call missing(of s[*]);
ncomb=comb(8, &y);
do j=1 to ncomb;
call allcombi(8, &y, of %do i=1 %to &y; t[&i] %end;);
do h=1 to &y;
s[h]=v[t[h]];
end;
if round(sum(of s[*]),.1)=r then do;
lflag=1;
leave;
end;
end;
if lflag then return;
y=y+1;
%end;
%mend ymac;
data comb0;
array m[0:8] _temporary_ (8 8 8 8 8 8 8 8 2);
array n[0:8] _temporary_;
array l[8] _temporary_;
array v[8] _temporary_;
array k[8];
array s[8] _temporary_;
array t[8] _temporary_;
do k1=0 to 8;
do k2=k1 to 8;
do k3=k2 to 8;
do k4=k3 to 8;
do k5=k4 to 8;
do k6=k5 to 8;
do k7=k6 to 8;
do k8=k7 to 8;
do _n_=1 to 8;
l[_n_]=ifn(k[_n_]=8, 0.3, k[_n_]);
v[_n_]=ifn(k[_n_]=0, 8, l[_n_]);
end;
r=round(mod(sum(of l[*]),8),.1);
f=1;
do i=0 to 8;
n[i]=0;
do j=1 to 8;
n[i]+(k[j]=i);
end;
if m[i]>=n[i] then f=f*comb(m[i],n[i]);
else do;
f=0;
leave;
end;
end;
if f then do;
x=floor(sum(of v[*])/8);
lflag=0;
if r=0 then y=0;
else if r in v then y=1;
else do;
y=2;
link callymac;
end;
if y=8 & sum(of v[*]) ne r then y=.N;
z=n[8];
output;
end;
end;
end;
end;
end;
end;
end;
end;
end;
return;
callymac:
%ymac
return;
drop h i j k: l: ncomb;
run;
/* Use PROC FREQ for aggregation, thus get percentages without additional effort */
proc freq data=comb0 noprint;
weight f;
tables r / out=want;
tables x*y*z / missing out=want2;
run;
proc print data=want;
sum count percent;
run;
proc print data=want2;
sum count percent;
run;
Explanations
@Tj_chua wrote:
Hi @FreelanceReinh,
I think i explain something wrong, we want to form groups of 8s with no remainders, so( 6 6 6 6 6 6 6 6), there are no eights that can be formed. while if we get (2 6 6 6 6 6 6 6) 1 eight can be formed.
This is greate! is it also possible to get below
1)let x - how many 8 was formed,
2)let y - how many items formed the remainder/ Was not used? istead of the remainder it self, preffer lest amount of item possible.
3)let z - how many 0.3 in the combination.
Example:
Combination x Y Z
4 4 4 2 2 1 3 .3 => 2 eights formed, 3 Items remained. remainder = 1. count for .3 items is 1
.3 .3 3 5 1 2 5 8=> 3 eights formed, 2 Items remained. remainder = .6. count for .3 items is 2
4 4 3 5 1 2 5 8=> 4 eights formed, 0 Items remained. remainder = 0. count for .3 items is 0
4 4 4 4 1 2 1 1 => 4 eights formed, 2 Items remained. remainder = 5. count for .3 items is 0
X Y ZFreq
1 0 0 ???1 1 1 ???
1 2 2 ???
.
.
Is this possible with the array?
Thank you so much, appriciate the help.
ill try to learn the explanation code.
Your latest specifications abandon the concept of calculating sums and remainders modulo 8, which was fundamental to the development done so far. Nevertheless, there is still hope that much of the existing program could continue to be used. This is based on two facts:
Unfortunately, the newly requested variable x is not well-defined.
Example:
values
(1 1 2 3 4 5 6 7) --> x=2?
(1 1 2 3 4 5 6 7) --> x=3?
Each of our 11583 eight-element subsets can be partitioned in 4140 different ways (Bell number B8). So, even if one had to search through all 11583*4140=47,953,620 partitions to determine the maximum (or minimum) number of eights formed (i.e., a well-defined version of x), this should be feasible.
I reckon that SAS/OR provides better tools for performing such searches (in an optimized way), but I don't have a license, nor experience with this SAS module.
Therefore, I recommend that you open a new thread with updated specifications (all derived variables must be well-defined) and an adapted title. Several "SAS/OR challenges" have been answered very quickly in the past couple of months, for example this one.
Good luck! And thank you for posting such interesting questions for mathematically inclined programmers.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.
Find more tutorials on the SAS Users YouTube channel.