BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
cminard
Obsidian | Level 7

Suppose that you have the following data set:

 

data temp;
	do set = 1 to 4;
		do count = 0 to 5;
			output;
		end;
	end;
run;

Now, you want to get all possible combinations of sets from this data set such that the sum of the "count" is 5. You have to include each "set" one time.

For example, I want:

combination set count
1 1 5
1 2 0
1 3 0
1 4 0
2 1 4
2 2 1
2 3 0
2 4 0
3 1 4
3 2 0
3 3 1
3 4 0
etc etc etc

 

Any suggestions on how to accomplish this would be appreciated. TIA!

 

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

Just to add a data step solution.

 

 

First, if you really have count=0 through 5 for each and every set, then you don't need data set TEMP (as in the proc optmodel solutions):

 

data want (keep=set1-set4);
  sum0=0;
  do set1=0 to 5;
    sum1=set1+sum0; 
    if sum1>5 then continue;
    do set2=0 to 5;
      sum2=sum1+set2;
      if sum2>5 then continue;
      do set3=0 to 5;
        sum3=sum2+set3; 
        if sum3>5 then continue;
        do set4=0 to 5;
          sum4=sum3+set4;  
          if sum4>5 then continue;
          if sum4=5 then output;
        end;
      end;
    end;
  end;
run;

As I said, this solution doesn't require data set TEMP.  But let's say data set TEMP has some "holes", i.e. not every SET has every COUNT from 0 through 5:  Then you need to read TEMP to determine which combinations to skip (the if x{n,setn}=. then continue; statements below):

 

data want (keep=set1-set4);
  array x {4,0:5} _temporary_;
  set temp end=end_of_temp;
  x{set,count}=1;
  if end_of_temp;

  sum0=0;
  do set1=0 to 5;
    if x{1,set1}=. then continue;
    sum1=set1+sum0; 
    if sum1>5 then continue;
    do set2=0 to 5;
      if x{2,set2}=. then continue;
      sum2=sum1+set2;
      if sum2>5 then continue;
      do set3=0 to 5;
        if x{3,set3}=. then continue;
        sum3=sum2+set3; 
        if sum3>5 then continue;
        do set4=0 to 5;
          if x{4,set4}=. then continue;
          sum4=sum3+set4;  
          if sum4>5 then continue;
          if sum4=5 then output;
        end;
      end;
    end;
  end;
run;

Conversion to macro code can shorten the above.  And it would be more amenable to a larger number of sets (just change the %let nsets=4; statement below), or a change in the count range, or target value:  

 

%macro test;
  %let nsets=4;
  %let countrange=0:5;
  %let target=5;

  data want (keep=set1-set&nsets);
    array x {&nsets,&countrange} _temporary_;
    set temp end=end_of_temp;
    x{set,count}=1;
    if end_of_temp;

    sum0=0;
    %do s=1 %to &nsets;
      do set&s = lbound(x,2) to hbound(x,2);
        if x{&s,set&s}=. then continue;
        sum&s=set&s + sum%eval(&s-1);
        if sum&s>&target then continue;
    %end;

      if sum&nsets=&target then output;

    %do s=1 %to &nsets;
      end;
    %end;
  run;
%mend test;
%test;

 

Note

  the statement

  sum&s=set&s + sum%eval(&s-1);

  is converted to

   sum1=set1 + sum0;
   sum2=set2 + sum1;
 

etc.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

6 REPLIES 6
ballardw
Super User

I am not seeing how your "want" complies with " sum of the "count" is 5." . You likely have to go into a bit more detail of exactly what "You have to include each "set" one time." means as well.

 

If you mean that you have other data than your example data set it is quite possible the "sum of count=5" may not occur for "set". So what would be the rule for selecting the one that does not match.

cminard
Obsidian | Level 7

In the example data set I have:

Set 1: 5+0+0+0=5

Set 2: 4+1+0+0=5

Set 3: 4+0+1+0=5

 

No, there is no other data.

mkeintz
PROC Star

Just to add a data step solution.

 

 

First, if you really have count=0 through 5 for each and every set, then you don't need data set TEMP (as in the proc optmodel solutions):

 

data want (keep=set1-set4);
  sum0=0;
  do set1=0 to 5;
    sum1=set1+sum0; 
    if sum1>5 then continue;
    do set2=0 to 5;
      sum2=sum1+set2;
      if sum2>5 then continue;
      do set3=0 to 5;
        sum3=sum2+set3; 
        if sum3>5 then continue;
        do set4=0 to 5;
          sum4=sum3+set4;  
          if sum4>5 then continue;
          if sum4=5 then output;
        end;
      end;
    end;
  end;
run;

As I said, this solution doesn't require data set TEMP.  But let's say data set TEMP has some "holes", i.e. not every SET has every COUNT from 0 through 5:  Then you need to read TEMP to determine which combinations to skip (the if x{n,setn}=. then continue; statements below):

 

data want (keep=set1-set4);
  array x {4,0:5} _temporary_;
  set temp end=end_of_temp;
  x{set,count}=1;
  if end_of_temp;

  sum0=0;
  do set1=0 to 5;
    if x{1,set1}=. then continue;
    sum1=set1+sum0; 
    if sum1>5 then continue;
    do set2=0 to 5;
      if x{2,set2}=. then continue;
      sum2=sum1+set2;
      if sum2>5 then continue;
      do set3=0 to 5;
        if x{3,set3}=. then continue;
        sum3=sum2+set3; 
        if sum3>5 then continue;
        do set4=0 to 5;
          if x{4,set4}=. then continue;
          sum4=sum3+set4;  
          if sum4>5 then continue;
          if sum4=5 then output;
        end;
      end;
    end;
  end;
run;

Conversion to macro code can shorten the above.  And it would be more amenable to a larger number of sets (just change the %let nsets=4; statement below), or a change in the count range, or target value:  

 

%macro test;
  %let nsets=4;
  %let countrange=0:5;
  %let target=5;

  data want (keep=set1-set&nsets);
    array x {&nsets,&countrange} _temporary_;
    set temp end=end_of_temp;
    x{set,count}=1;
    if end_of_temp;

    sum0=0;
    %do s=1 %to &nsets;
      do set&s = lbound(x,2) to hbound(x,2);
        if x{&s,set&s}=. then continue;
        sum&s=set&s + sum%eval(&s-1);
        if sum&s>&target then continue;
    %end;

      if sum&nsets=&target then output;

    %do s=1 %to &nsets;
      end;
    %end;
  run;
%mend test;
%test;

 

Note

  the statement

  sum&s=set&s + sum%eval(&s-1);

  is converted to

   sum1=set1 + sum0;
   sum2=set2 + sum1;
 

etc.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
FreelanceReinh
Jade | Level 19

Hello @cminard,

 

You can create a Cartesian product (view) with PROC SQL and then transpose the result with a DATA step.

proc sql;
create view _tmp as
select t1.count as a1,
       t2.count as a2,
       t3.count as a3,
       t4.count as a4
from temp(where=(set=1)) t1,
     temp(where=(set=2)) t2,
     temp(where=(set=3)) t3,
     temp(where=(set=4)) t4
having a1+a2+a3+a4=5
order by 1 desc, 2 desc, 3 desc, 4 desc;
quit;

data want(drop=a:);
set _tmp;
array a[*] a:;
combination=_n_;
do set=1 to dim(a);
  count=a[set];
  output;
end;
run;
Ksharp
Super User

It is OR things. Try post it at OR forum and calling @RobPratt 

 

 

proc optmodel;
num numLists=4;
set LIST{1..numLists};
LIST[1]=0..5;
LIST[2]=0..5;
LIST[3]=0..5;
LIST[4]=0..5;

num target=5;
var x{i in 1..numLIsts,j in LIST[i]} binary;
con oneitemperlist{i in 1..numLists} : sum{j in LIST[i]} x[i,j]=1;
con sumtotarget:sum{i in 1..numLists,j in LIST[i]} j*x[i,j]=target;
solve with clp/findallsolns;
set SUPPORT{s in 1.._NSOL_}={i in 1..numLists,j in LIST[i]:x[i,j].sol[s]>0.5};
put SUPPORT[*];
quit;

 

 

 

 

{<1,5>,<2,0>,<3,0>,<4,0>} {<1,4>,<2,0>,<3,0>,<4,1>} {<1,4>,<2,1>,<3,0>,<4,0>} {<1,4>,<2,0>,<3,1>,<4,0>} {<1,3>,<2,1>,<3,0>,<4,1>} {<
 1,3>,<2,0>,<3,1>,<4,1>} {<1,3>,<2,1>,<3,1>,<4,0>} {<1,3>,<2,0>,<3,0>,<4,2>} {<1,3>,<2,2>,<3,0>,<4,0>} {<1,3>,<2,0>,<3,2>,<4,0>} {<1,
 2>,<2,1>,<3,1>,<4,1>} {<1,2>,<2,1>,<3,0>,<4,2>} {<1,2>,<2,0>,<3,1>,<4,2>} {<1,2>,<2,2>,<3,0>,<4,1>} {<1,2>,<2,2>,<3,1>,<4,0>} {<1,2>
 ,<2,0>,<3,2>,<4,1>} {<1,2>,<2,1>,<3,2>,<4,0>} {<1,2>,<2,0>,<3,3>,<4,0>} {<1,2>,<2,3>,<3,0>,<4,0>} {<1,2>,<2,0>,<3,0>,<4,3>} {<1,1>,<
 2,1>,<3,1>,<4,2>} {<1,1>,<2,2>,<3,0>,<4,2>} {<1,1>,<2,2>,<3,1>,<4,1>} {<1,1>,<2,0>,<3,2>,<4,2>} {<1,1>,<2,2>,<3,2>,<4,0>} {<1,1>,<2,
 1>,<3,2>,<4,1>} {<1,1>,<2,0>,<3,3>,<4,1>} {<1,1>,<2,1>,<3,3>,<4,0>} {<1,1>,<2,3>,<3,0>,<4,1>} {<1,1>,<2,3>,<3,1>,<4,0>} {<1,1>,<2,0>
 ,<3,1>,<4,3>} {<1,1>,<2,1>,<3,0>,<4,3>} {<1,1>,<2,0>,<3,4>,<4,0>} {<1,1>,<2,0>,<3,0>,<4,4>} {<1,1>,<2,4>,<3,0>,<4,0>} {<1,0>,<2,2>,<
 3,1>,<4,2>} {<1,0>,<2,1>,<3,2>,<4,2>} {<1,0>,<2,2>,<3,2>,<4,1>} {<1,0>,<2,0>,<3,3>,<4,2>} {<1,0>,<2,1>,<3,3>,<4,1>} {<1,0>,<2,2>,<3,
 3>,<4,0>} {<1,0>,<2,3>,<3,0>,<4,2>} {<1,0>,<2,3>,<3,1>,<4,1>} {<1,0>,<2,3>,<3,2>,<4,0>} {<1,0>,<2,2>,<3,0>,<4,3>} {<1,0>,<2,0>,<3,2>
 ,<4,3>} {<1,0>,<2,1>,<3,1>,<4,3>} {<1,0>,<2,1>,<3,4>,<4,0>} {<1,0>,<2,0>,<3,4>,<4,1>} {<1,0>,<2,0>,<3,1>,<4,4>} {<1,0>,<2,1>,<3,0>,<
 4,4>} {<1,0>,<2,4>,<3,1>,<4,0>} {<1,0>,<2,4>,<3,0>,<4,1>} {<1,0>,<2,0>,<3,0>,<4,5>} {<1,0>,<2,0>,<3,5>,<4,0>} {<1,0>,<2,5>,<3,0>,<4,
 0>}
解汇总
Solver	CLP
变量选择	MINR
Objective Function	(0)
Solution Status	All Solutions Found
Objective Value	0
 	 
Solutions Found	56
Presolve Time	0.00
Solution Time	0.01

 

RobPratt
SAS Super FREQ

That works, or you can use 4 integer variables and 1 linear constraint:

proc optmodel;
   num numLists = 4;
   num target = 5;
   var X {1..numLists} >= 0 <= target integer;
   con SumToTarget: sum {i in 1..numLists} X[i] = target;
   solve with clp / findallsolns;
   create data want from [combination set]={s in 1.._NSOL_, i in 1..numLists} count=X[i].sol[s];
quit;
proc sort data=want;
   by combination set;
run;

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 944 views
  • 2 likes
  • 6 in conversation