crawdaddy
Calcite | Level 5

Trying to generate a list of all possible combinations using the following data:

 

Food Group | Name | Points
Grains | Brown Rice | 2
Grains | Quinoa | 2
Grains | White Rice | 3
Grains | English Muffin | 4
Fruits | Watermelon | 3
Fruits | Apple | 2
Fruits | Peach | 1
Fruits | Grapes | 4
Fruits | Canned Fruit | 4
Fruits | Pear | 1
Fruits | Mango | 2
Vegetables/Legumes | Iceberg Lettuce | 2
Vegetables/Legumes | Squash | 2
Vegetables/Legumes | Eggplant | 1
Vegetables/Legumes | Kale | 3
Vegetables/Legumes | Red Cabbage | 4
Vegetables/Legumes | Asparagus | 3
Vegetables/Legumes | Sweet Potato | 2
Vegetables/Legumes | White Potato | 1
Vegetables/Legumes | Corn on Cob | 1
Vegetables/Legumes | Red Lentils | 4
Vegetables/Legumes | Tomato | 2
Vegetables/Legumes | Bok Choy | 3
Vegetables/Legumes | Frozen Vegetables | 1
Vegetables/Legumes | Red Onion | 2
Sweets | Ice Cream | 1
Sweets | Pudding | 1
Sweets | Dark Chocolate | 4
Sweets | Milk Chocolate | 3
Lean Meats | Chicken | 2
Lean Meats | Fish | 4
Lean Meats | Ground Turkey | 2
Lean Meats | Fried Fish | 1

 

My exercise is to generate all of the combinations where I have 1 Grain, 2 Fruit, 3 Vegetables, 1 Sweet, 1 Lean Meat where total points is less than or equal to 25.

 

I do not have access to PROC IML.

 

Thanks!

Accepted Solutions
mkeintz
PROC Star

Without getting into how to do this in SAS, how about this as a strategy:

 

  1. Make 5 datasets, one per group
    1. Vegetable dataset: All 3-element combinations of Vegetables
    2. Fruits: All 2-element combinations of Fruits
    3. etc. etc.
  2. Do a full cartesian crossing of the 5 datasets.
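
The two steps above can be sketched in PROC SQL. This is only an illustration under stated assumptions, not the poster's code: the dataset name foods, its columns food_group, name and points, and the intermediate tables fruit_pairs and veg_triples are all hypothetical names chosen here. The three single-item groups are filtered in place with WHERE= instead of being copied into datasets of their own.

```sas
/* Hypothetical input table: one row per food item (values from the question). */
data foods;
  infile datalines dlm='|' dsd truncover;
  input food_group :$20. name :$20. points;
datalines;
Grains|Brown Rice|2
Grains|Quinoa|2
Grains|White Rice|3
Grains|English Muffin|4
Fruits|Watermelon|3
Fruits|Apple|2
Fruits|Peach|1
Fruits|Grapes|4
Fruits|Canned Fruit|4
Fruits|Pear|1
Fruits|Mango|2
Vegetables/Legumes|Iceberg Lettuce|2
Vegetables/Legumes|Squash|2
Vegetables/Legumes|Eggplant|1
Vegetables/Legumes|Kale|3
Vegetables/Legumes|Red Cabbage|4
Vegetables/Legumes|Asparagus|3
Vegetables/Legumes|Sweet Potato|2
Vegetables/Legumes|White Potato|1
Vegetables/Legumes|Corn on Cob|1
Vegetables/Legumes|Red Lentils|4
Vegetables/Legumes|Tomato|2
Vegetables/Legumes|Bok Choy|3
Vegetables/Legumes|Frozen Vegetables|1
Vegetables/Legumes|Red Onion|2
Sweets|Ice Cream|1
Sweets|Pudding|1
Sweets|Dark Chocolate|4
Sweets|Milk Chocolate|3
Lean Meats|Chicken|2
Lean Meats|Fish|4
Lean Meats|Ground Turkey|2
Lean Meats|Fried Fish|1
;
run;

proc sql;
  /* Step 1a: all 2-element fruit combinations.
     a.name < b.name keeps each unordered pair exactly once. */
  create table fruit_pairs as
    select a.name as fruit1, b.name as fruit2,
           a.points + b.points as fruit_pts
    from foods as a, foods as b
    where a.food_group = 'Fruits' and b.food_group = 'Fruits'
      and a.name < b.name;

  /* Step 1b: all 3-element vegetable combinations, same ordering trick. */
  create table veg_triples as
    select a.name as veg1, b.name as veg2, c.name as veg3,
           a.points + b.points + c.points as veg_pts
    from foods as a, foods as b, foods as c
    where a.food_group = 'Vegetables/Legumes'
      and b.food_group = 'Vegetables/Legumes'
      and c.food_group = 'Vegetables/Legumes'
      and a.name < b.name and b.name < c.name;

  /* Step 2: full Cartesian crossing of the five groups,
     keeping only rows whose total is at most 25 points. */
  create table want as
    select g.name as grain, f.fruit1, f.fruit2,
           v.veg1, v.veg2, v.veg3,
           s.name as sweet, m.name as meat,
           g.points + f.fruit_pts + v.veg_pts + s.points + m.points
             as total_points
    from foods(where=(food_group = 'Grains'))     as g,
         fruit_pairs                              as f,
         veg_triples                              as v,
         foods(where=(food_group = 'Sweets'))     as s,
         foods(where=(food_group = 'Lean Meats')) as m
    where calculated total_points <= 25;
quit;
```

SAS will write a note to the log that the last query involves a Cartesian product; that is expected here, since the crossing is the point of the step.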

 

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Patrick
Opal | Level 21

@crawdaddy

As this is an exercise, we probably shouldn't just give you the full solution.

What solution approaches have you considered and what have you tried already?

 

mkeintz
PROC Star

And you are probably being asked to produce total points for each qualifying combination, yes?

Ksharp
Super User
It is more like an OR problem.
Better to post it in the OR forum; @RobPratt is there.
If you have SAS/IML, maybe I could write a Genetic Algorithm for you.



crawdaddy
Calcite | Level 5

Yes, this was exactly what I needed in order to generate all of the combinations.  

 

This was the start I needed to get the ball rolling.  

 

I have additional rules I need to account for, however I think I'll be able to handle those myself.

Patrick
Opal | Level 21

@crawdaddy

Creating all the combinations first and then filtering the result is the standard way of doing this. The downside is that you can end up with a huge number of combinations and a very big intermediate table.

 

If you know in advance that you don't need to test every ordering, i.e. that for vegetables A B C is the same as B A C or C B A and only one of these needs to be tested, then the sequential processing of the SAS data step often offers alternative designs that create and test only the combinations you really need.

Such an approach can significantly outperform anything that relies on creating all possible combinations (execute the sample code below and you'll see).

The disadvantage of such a data step approach is that the code can become quite complex.
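
To put numbers on the difference, here is the arithmetic for the data in this thread (4 grains, 7 fruits, 14 vegetables, 4 sweets, 4 lean meats), counted before the 25-point filter is applied. When order within a group is ignored:

\[
4 \cdot \binom{7}{2} \cdot \binom{14}{3} \cdot 4 \cdot 4
= 4 \cdot 21 \cdot 364 \cdot 16
= 489{,}216
\]

When every ordering of the two fruits and three vegetables is generated as a separate row:

\[
4 \cdot (7 \cdot 6) \cdot (14 \cdot 13 \cdot 12) \cdot 4 \cdot 4
= 4 \cdot 42 \cdot 2184 \cdot 16
= 5{,}870{,}592
\]

The ratio is \(2! \cdot 3! = 12\), so skipping the redundant orderings cuts the candidate set to one twelfth.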

 

Below is an example of how this could look using SAS data step code.

/* Note: Food holds the food group and Group_Name holds the item name. */
data have;
  infile datalines truncover dlm='|' dsd;
  input Food:$40. Group_Name:$20. Points;
datalines;
Grains|Brown Rice|2
Grains|Quinoa|2
Grains|White Rice|3
Grains|English Muffin|4
Fruits|Watermelon|3
Fruits|Apple|2
Fruits|Peach|1
Fruits|Grapes|4
Fruits|Canned Fruit|4
Fruits|Pear|1
Fruits|Mango|2
Vegetables/Legumes|Iceberg Lettuce|2
Vegetables/Legumes|Squash|2
Vegetables/Legumes|Eggplant|1
Vegetables/Legumes|Kale|3
Vegetables/Legumes|Red Cabbage|4
Vegetables/Legumes|Asparagus|3
Vegetables/Legumes|Sweet Potato|2
Vegetables/Legumes|White Potato|1
Vegetables/Legumes|Corn on Cob|1
Vegetables/Legumes|Red Lentils|4
Vegetables/Legumes|Tomato|2
Vegetables/Legumes|Bok Choy|3
Vegetables/Legumes|Frozen Vegetables|1
Vegetables/Legumes|Red Onion|2
Sweets|Ice Cream|1
Sweets|Pudding|1
Sweets|Dark Chocolate|4
Sweets|Milk Chocolate|3
Lean Meats|Chicken|2
Lean Meats|Fish|4
Lean Meats|Ground Turkey|2
Lean Meats|Fried Fish|1
;
run;

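/* DLCREATEDIR creates the folder if it is missing; MEMLIB (a Windows */
/* LIBNAME option) keeps the library in memory for faster POINT= access. */
options dlcreatedir;
libname group 'c:\memlib' memlib;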
data group.Grain group.Fruit group.Veg group.Sweet group.Meat missed;
  set have;
  select(food);
    when('Grains')              output group.Grain;
    when('Fruits')              output group.Fruit;
    when('Vegetables/Legumes')  output group.Veg;
    when('Sweets')              output group.Sweet;
    when('Lean Meats')          output group.Meat;
    otherwise output missed;
  end;
run;

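/* Abort if any row fell into the OTHERWISE branch. NOBS= is set at */
/* compile time, so it can be tested before the SET statement executes. */
data _null_;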
  if nobs>0 then
    do;
      put "Missed cases. Fix code";
      abort;
    end;
  stop;
  set missed nobs=nobs;
run;

data want(drop=food group_name points);

  array food_group  {8} $20 Grain Fruit1 Fruit2 Veg1 Veg2 Veg3 Sweet Meat; 
  array point_group {8} _temporary_;

  set group.Grain;
  food_group[1] =Group_Name;
  point_group[1]=points;

  do pFruit=1 to nFruit;
    set group.fruit point=pFruit nobs=nFruit;
    food_group[2] =Group_Name;
    point_group[2]=points;

    /* start after the first fruit so each unordered pair appears once */
    do pFruit2=pFruit+1 to nFruit;
      set group.fruit point=pFruit2;
      food_group[3] =Group_Name;
      point_group[3]=points;

      do pVeg=1 to nVeg;
        set group.Veg point=pVeg nobs=nVeg;
        food_group[4] =Group_Name;
        point_group[4]=points;

        /* same idea: pVeg < pVeg2 < pVeg3 yields each triple once */
        do pVeg2=pVeg+1 to nVeg;
          set group.Veg point=pVeg2;
          food_group[5] =Group_Name;
          point_group[5]=points;

          do pVeg3=pVeg2+1 to nVeg;
            set group.Veg point=pVeg3;
            food_group[6] =Group_Name;
            point_group[6]=points;

            do pSweet=1 to nSweet;
              set group.Sweet point=pSweet nobs=nSweet;
              food_group[7] =Group_Name;
              point_group[7]=points;

              do pMeat=1 to nMeat;
                set group.Meat point=pMeat nobs=nMeat;
                food_group[8] =Group_Name;
                point_group[8]=points;
                point_sum=sum(of point_group[*]);
                if point_sum<=25 then output;

              end; /* pMeat */
            end; /* pSweet */
          end; /* pVeg3 */
        end; /* pVeg2 */
      end; /* pVeg */
    end; /* pFruit2 */
  end; /* pFruit */

run;

libname group clear;
PGStats
Opal | Level 21

IN THEORY, proc plan could be used to generate all possible combinations, something like:

 

ods listing select none;
proc plan;
factors 
    block = 5870592 ordered
    grai = 1 of 4 comb
    frui = 2 of 7 comb
    vege = 3 of 14 comb
    swee = 1 of 4 comb
    lean = 1 of 4 comb;
ods output plan=design;
run;
ods listing select all;

but it crashes with INTEGER OVERFLOW. It might work with a 64 bit version of SAS.

 

PG
ballardw
Super User

When you say 2 Fruit, 3 Vegetables, must the 2 fruits be distinct, or may they repeat? Same question for the vegetables.

Must all groups be represented with that exact count of items?

Is the list of items shown exhaustive, meaning there are no other grains, fruits etc. in the actual problem other than those shown?

 

And what should the output actually look like? I can think of one approach involving nested do loops that would work but may not provide the prettiest output.
