Generate all Combinations

New Contributor
Posts: 2
Trying to generate a list of all possible combinations using the following data:

 

Food Group | Name | Points
Grains | Brown Rice | 2
Grains | Quinoa | 2
Grains | White Rice | 3
Grains | English Muffin | 4
Fruits | Watermelon | 3
Fruits | Apple | 2
Fruits | Peach | 1
Fruits | Grapes | 4
Fruits | Canned Fruit | 4
Fruits | Pear | 1
Fruits | Mango | 2
Vegetables/Legumes | Iceberg Lettuce | 2
Vegetables/Legumes | Squash | 2
Vegetables/Legumes | Eggplant | 1
Vegetables/Legumes | Kale | 3
Vegetables/Legumes | Red Cabbage | 4
Vegetables/Legumes | Asparagus | 3
Vegetables/Legumes | Sweet Potato | 2
Vegetables/Legumes | White Potato | 1
Vegetables/Legumes | Corn on Cob | 1
Vegetables/Legumes | Red Lentils | 4
Vegetables/Legumes | Tomato | 2
Vegetables/Legumes | Bok Choy | 3
Vegetables/Legumes | Frozen Vegetables | 1
Vegetables/Legumes | Red Onion | 2
Sweets | Ice Cream | 1
Sweets | Pudding | 1
Sweets | Dark Chocolate | 4
Sweets | Milk Chocolate | 3
Lean Meats | Chicken | 2
Lean Meats | Fish | 4
Lean Meats | Ground Turkey | 2
Lean Meats | Fried Fish | 1

 

My exercise is to generate all of the combinations with 1 Grain, 2 Fruits, 3 Vegetables, 1 Sweet and 1 Lean Meat, where the total points are less than or equal to 25.

 

I do not have access to PROC IML.

 

Thanks!


Accepted Solutions
Solution
‎09-18-2017 01:32 PM
Trusted Advisor
Posts: 1,294

Re: Generate all Combinations

Posted in reply to crawdaddy

Without getting into how to do this in SAS, how about this as a strategy:

 

  1. Make 5 datasets, one per group
    1. Vegetable dataset: All 3-element combinations of Vegetables
    2. Fruits: All 2-element combinations of Fruits
    3. And so on for the remaining groups (Grains, Sweets and Lean Meats need only 1-element combinations)
  2. Do a full cartesian crossing of the 5 datasets.
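The two steps above could be sketched in SAS with PROC SQL self-joins. This is only an illustration under stated assumptions: the dataset name `foods`, its columns `food_group`, `name` and `points`, and the output tables `fruit_pairs`, `veg_triples` and `want` are hypothetical names, the data is a small subset of the posted table, and item names are assumed to be unique within a group so that `a.name < b.name` keeps exactly one ordering of each combination.

```sas
/* Subset of the posted table, for illustration only (hypothetical names) */
data foods;
  infile datalines dlm='|' truncover;
  input food_group :$20. name :$20. points;
datalines;
Grains|Brown Rice|2
Grains|White Rice|3
Fruits|Watermelon|3
Fruits|Apple|2
Fruits|Peach|1
Vegetables/Legumes|Squash|2
Vegetables/Legumes|Eggplant|1
Vegetables/Legumes|Kale|3
Vegetables/Legumes|Red Cabbage|4
Sweets|Ice Cream|1
Sweets|Dark Chocolate|4
Lean Meats|Chicken|2
Lean Meats|Fish|4
;
run;

proc sql;
  /* Step 1: per-group combination tables.
     The name ordering keeps one row per unordered pair or triple. */
  create table fruit_pairs as
    select a.name as fruit1, b.name as fruit2,
           a.points + b.points as fruit_pts
    from foods as a, foods as b
    where a.food_group = 'Fruits' and b.food_group = 'Fruits'
      and a.name < b.name;

  create table veg_triples as
    select a.name as veg1, b.name as veg2, c.name as veg3,
           a.points + b.points + c.points as veg_pts
    from foods as a, foods as b, foods as c
    where a.food_group = 'Vegetables/Legumes'
      and b.food_group = 'Vegetables/Legumes'
      and c.food_group = 'Vegetables/Legumes'
      and a.name < b.name and b.name < c.name;

  /* Step 2: Cartesian crossing of the five groups, keeping totals <= 25.
     Single-item groups are read straight from FOODS with a WHERE= option. */
  create table want as
    select g.name as grain, f.fruit1, f.fruit2,
           v.veg1, v.veg2, v.veg3,
           s.name as sweet, m.name as meat,
           g.points + f.fruit_pts + v.veg_pts + s.points + m.points
             as point_sum
    from foods(where=(food_group='Grains')) as g,
         fruit_pairs as f,
         veg_triples as v,
         foods(where=(food_group='Sweets')) as s,
         foods(where=(food_group='Lean Meats')) as m
    where calculated point_sum <= 25;
quit;
```

SAS writes a note to the log that the query involves a Cartesian product, which is expected here. On the full posted table, the same joins would produce 21 fruit pairs and 364 vegetable triples before the final crossing.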

 

 

 



All Replies
Respected Advisor
Posts: 4,569

Re: Generate all Combinations

Posted in reply to crawdaddy

@crawdaddy

As this is an exercise, we probably shouldn't just give you the full solution.

What solution approaches have you considered and what have you tried already?

 

Trusted Advisor
Posts: 1,294

Re: Generate all Combinations

Posted in reply to crawdaddy

And you are probably being asked to produce total points for each qualifying combination, yes?

Super User
Posts: 10,623

Re: Generate all Combinations

Posted in reply to crawdaddy
It is more like an OR problem.
Better to post it in the OR forum; @RobPratt is there.
If you have SAS/IML, maybe I could write a Genetic Algorithm for you.



New Contributor
Posts: 2

Re: Generate all Combinations

Yes, this was exactly what I needed in order to generate all of the combinations.  

 

This was the start I needed to get the ball rolling.  

 

I have additional rules I need to account for, however I think I'll be able to handle those myself.

Respected Advisor
Posts: 4,569

Re: Generate all Combinations

[ Edited ]
Posted in reply to crawdaddy

@crawdaddy

Creating all combinations first and then filtering the result is the standard way of doing this. The downside is that you can end up with a huge number of combinations and a very big intermediate table.

If you know in advance that you don't need to test every ordering, i.e. that for vegetables A B C is the same as B A C or C B A and only one of these needs to be tested, then the sequential processing of the SAS DATA step often offers alternative designs that create and test only the combinations you really need.

Such an approach can significantly outperform anything that relies on creating all possible combinations (run the sample code below and you'll see).

The disadvantage of such a DATA step approach is that the code can become quite complex.
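To put rough numbers on the difference, here is a count based on the group sizes in the posted table (4 grains, 7 fruits, 14 vegetables/legumes, 4 sweets, 4 lean meats), before the points filter is applied:

```latex
\begin{aligned}
\text{unordered: } & 4 \cdot \binom{7}{2} \cdot \binom{14}{3} \cdot 4 \cdot 4
  = 4 \cdot 21 \cdot 364 \cdot 16 = 489{,}216 \\
\text{ordered: } & 4 \cdot (7 \cdot 6) \cdot (14 \cdot 13 \cdot 12) \cdot 4 \cdot 4
  = 4 \cdot 42 \cdot 2184 \cdot 16 = 5{,}870{,}592
\end{aligned}
```

Treating each ordering of the 2 fruits and 3 vegetables as a separate case multiplies the work by 2! · 3! = 12.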

 

Below is an example of how this could look using SAS DATA step code.

data have;
  infile datalines truncover dlm='|' dsd;
  input Food:$40. Group_Name:$20. Points;  /* Food = food group, Group_Name = item name */
datalines;
Grains|Brown Rice|2
Grains|Quinoa|2
Grains|White Rice|3
Grains|English Muffin|4
Fruits|Watermelon|3
Fruits|Apple|2
Fruits|Peach|1
Fruits|Grapes|4
Fruits|Canned Fruit|4
Fruits|Pear|1
Fruits|Mango|2
Vegetables/Legumes|Iceberg Lettuce|2
Vegetables/Legumes|Squash|2
Vegetables/Legumes|Eggplant|1
Vegetables/Legumes|Kale|3
Vegetables/Legumes|Red Cabbage|4
Vegetables/Legumes|Asparagus|3
Vegetables/Legumes|Sweet Potato|2
Vegetables/Legumes|White Potato|1
Vegetables/Legumes|Corn on Cob|1
Vegetables/Legumes|Red Lentils|4
Vegetables/Legumes|Tomato|2
Vegetables/Legumes|Bok Choy|3
Vegetables/Legumes|Frozen Vegetables|1
Vegetables/Legumes|Red Onion|2
Sweets|Ice Cream|1
Sweets|Pudding|1
Sweets|Dark Chocolate|4
Sweets|Milk Chocolate|3
Lean Meats|Chicken|2
Lean Meats|Fish|4
Lean Meats|Ground Turkey|2
Lean Meats|Fried Fish|1
;
run;

options dlcreatedir;
libname group 'c:\memlib' memlib;
data group.Grain group.Fruit group.Veg group.Sweet group.Meat missed;
  set have;
  select(food);
    when('Grains')              output group.Grain;
    when('Fruits')              output group.Fruit;
    when('Vegetables/Legumes')  output group.Veg;
    when('Sweets')              output group.Sweet;
    when('Lean Meats')          output group.Meat;
    otherwise output missed;
  end;
run;

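/* Abort if any row fell into MISSED. NOBS= is set at compile time,
   so it can be tested before the SET statement executes. */
data _null_;
  if nobs>0 then
    do;
      put "Missed cases. Fix code";
      abort;
    end;
  stop;
  set missed nobs=nobs;
run;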

/* Nested POINT= loops: one output row per unordered combination with point_sum<=25 */
data want(drop=food group_name points);

  array food_group  {8} $20 Grain Fruit1 Fruit2 Veg1 Veg2 Veg3 Sweet Meat; 
  array point_group {8} _temporary_;

  set group.Grain;
  food_group[1] =Group_Name;
  point_group[1]=points;

  do pFruit=1 to nFruit;
    set group.fruit point=pFruit nobs=nFruit;
    food_group[2] =Group_Name;
    point_group[2]=points;

    /* Start after pFruit so each fruit pair is generated once; the Veg loops use the same idea */
    do pFruit2=pFruit+1 to nFruit;
      set group.fruit point=pFruit2;
      food_group[3] =Group_Name;
      point_group[3]=points;

      do pVeg=1 to nVeg;
        set group.Veg point=pVeg nobs=nVeg;
        food_group[4] =Group_Name;
        point_group[4]=points;

        do pVeg2=pVeg+1 to nVeg;
          set group.Veg point=pVeg2;
          food_group[5] =Group_Name;
          point_group[5]=points;

          do pVeg3=pVeg2+1 to nVeg;
            set group.Veg point=pVeg3;
            food_group[6] =Group_Name;
            point_group[6]=points;

            do pSweet=1 to nSweet;
              set group.Sweet point=pSweet nobs=nSweet;
              food_group[7] =Group_Name;
              point_group[7]=points;

              do pMeat=1 to nMeat;
                set group.Meat point=pMeat nobs=nMeat;
                food_group[8] =Group_Name;
                point_group[8]=points;
                point_sum=sum(of point_group[*]);
                if point_sum<=25 then output;

              end; /* pMeat */
            end; /* pSweet */
          end; /* pVeg3 */
        end; /* pVeg2 */
      end; /* pVeg */
    end; /* pFruit2 */
  end; /* pFruit */

run;

libname group clear;
Esteemed Advisor
Posts: 5,408

Re: Generate all Combinations

Posted in reply to crawdaddy

In theory, PROC PLAN could be used to generate all possible combinations, something like:

 

ods listing select none;
proc plan;
factors 
    block = 5870592 ordered
    grai = 1 of 4 comb
    frui = 2 of 7 comb
    vege = 3 of 14 comb
    swee = 1 of 4 comb
    lean = 1 of 4 comb;
ods output plan=design;
run;
ods listing select all;

But it crashes with an INTEGER OVERFLOW error. It might work with a 64-bit version of SAS.

 

PG
Super User
Posts: 13,084

Re: Generate all Combinations

Posted in reply to crawdaddy

When you say 2 Fruit, 3 Vegetables, must the 2 fruits be distinct, or may the same item appear twice? The same question applies to the vegetables.

Must all groups be represented with that exact count of items?

Is the list of items shown exhaustive, meaning there are no other grains, fruits etc. in the actual problem other than those shown?

 

And what should the output actually look like? I can think of one approach involving nested do loops that would work but may not provide the prettiest output.
