DATA Step, Macro, Functions and more

All possible shopping lists

Accepted Solution Solved
Reply
Super Contributor
Posts: 413
Accepted Solution

All possible shopping lists

[ Edited ]

Hi,

 

suppose that there are 5 items in a store:

item price
1 10
2 20
3 30
4 40
5 50

 

Assuming that you can buy each item only once, how would it be possible to create all possible shopping lists (consisting of1 item, 2 items ... 5 items) and to calculate the total price of each shopping list.

 

Thanks!  

 


Accepted Solutions
Solution
‎01-15-2017 11:08 PM
Super User
Posts: 5,093

Re: All possible shopping lists

[ Edited ]

And a macro version ... not very flexible and making assumptions about the required parameters and processing ...

 

%macro combinations (have=, want=, n_items=);

 

%local i;

 

data &want;

array items {&n_items} item1-item&n_items;

array prices {&n_items} price1-price&n_items;

array v {&n_items};

do i=1 to 5;

   set &have;

   items{i} = item;

   prices{i} = price;

end;

%do i=1 %to &n_items;

   do v&i = 0, price&i;

%end;

length item_list $ &n_items;

item_list = repeat('N', &n_items-1);

   do i=1 to &n_items;

      if v{i} then substr(item_list, i, 1)='Y';

   end;

   total_cost = sum(of v{*});

   output;

%do i=1 %to &n_items;

   end;

%end;

stop;

run;

 

%mend combinations;

 

The code is untested at this point.  

 

Even though I've written several programs of this type, I've never seen fit to convert it to a macro.  The requirements always seem to change enough each time so that it's easier to copy and modify a SAS program.

 

View solution in original post


All Replies
PROC Star
Posts: 7,364

Re: All possible shopping lists

[ Edited ]

I'm not sure if I understand what you're looking for, but it sounds like you want all combinations of 1 out of 5, 2 out of 5, etc.

 

There is a much easier solution available if you have access to proc plan. The following does the same thing through a series of macros and data steps. The final solution will be in a file called 'want'.

 

An explanation of the code can be found at: https://www.google.ca/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&cad=rja&uact=8&ved=0ahUKEwiO6-H_8sDRA...

 

HTH,

Art, CEO, AnalystFinder.com

 

data fmtdata;
infile cards dlm='09'x;
retain fmtname 'pcodes' type 'I';
input start label;
cards;
1 10
2 20
3 30
4 40
5 50
;
proc format cntlin = fmtdata;
run;

%macro comb(dsout2=&dsout,dsin2=__d1,k=,n=,len=) ;
%local dsin2 dsout2 numvar i start stop n k len ;
%let start = 1 ;
%let stop = %eval( &n - &k) ;
data &dsout2 ;
set &dsin2 ;
array _vals {&n} $ ;
array combo {&k} $ &len ;
keep j combo1 - combo&k ;
retain j 1 ;
%do i = 1 %to &k;
%let stop = %eval(&stop + 1) ;
do var&i = &start to &stop ;
%let start = %str(%(var&i + 1 %)) ;
%end ;
%do i = 1 %to &k ;
combo{&i} = _vals{var&i} ;
%end ;
output ;
j + 1 ;
%do i = 1 %to &k ;
end ;
%end ;
run;
%mend comb;

%macro setup(dsout=_NULL_,dsin=&syslast,var=,ks=) ;
%local dsout dsin var totvar ks ;
proc freq data=&dsin noprint ;
tables &var / out=__count ;
data _NULL_ ;
set __count end=last ;
retain len oldlen 0 ;
nk + 1 ;
len = length(&var) ;
if ( len > oldlen ) then oldlen = len ;
if count > 1 then do;
put "WARNING: The variable %upcase(&var)" /
"WARNING: has “ count “ occurrences of the value "
var;
end ;
if last then do ;
if &ks > nk then do;
put "ERROR: K greater than the Number of values (N)" ;
put "ERROR: Value of K: &ks Value of N: " nk ;
put "ERROR: EXITING Macro!!!!!!!!!!!!!!!!!!" ;
end ;
call symput('totvar',trim(left(put(nk,5.))) ) ;
call symput('length',put(oldlen,3.) );
end ;
run ;
%if &ks > &totvar %then %goto exit ;
data __d1 ;
set __count end=last ;
array _vals {&totvar} $ &length;
retain _vals1-_vals&totvar ;
keep _vals1-_vals&totvar ;
_vals{_N_} = &var ;
if last then output ;
%comb(k=&ks,n=&totvar,len=&length)
run ;
%exit: %mend setup ;

%setup(var=start,ks=1,dsin=fmtdata,dsout=combin1)
%setup(var=start,ks=2,dsin=fmtdata,dsout=combin2)
%setup(var=start,ks=3,dsin=fmtdata,dsout=combin3)
%setup(var=start,ks=4,dsin=fmtdata,dsout=combin4)
%setup(var=start,ks=5,dsin=fmtdata,dsout=combin5)

data want (drop=i j);
set combin1 (rename=(combo1=item1))
combin2 (rename=(combo1-combo2=item1-item2))
combin3 (rename=(combo1-combo3=item1-item3))
combin4 (rename=(combo1-combo4=item1-item4))
combin5 (rename=(combo1-combo5=item1-item5));
array combos(5) item1-item5;
array prices(5);
do i=1 to 5;
prices(i)=input(combos(i),pcodes.);
end;
total=sum(of prices(*));
run;

 

Super Contributor
Posts: 413

Re: All possible shopping lists

hi art297,

 

I ran the code but it gave me an empty "want" data table.

 

Also, data fmtdata has 4 columns and 2 observations, and not the item and column data - maybe that is the reason why want is empty?

PROC Star
Posts: 7,364

Re: All possible shopping lists

It didn't work because the tab characters (in the first data step) were lost in the post. I changed them to commas, as shown below, and the code produces the desired file. I only posted it because I noticed that no one had responded by the time I saw your initial request. However, while it does indeed work, the other responses had much more efficient code.

 

data fmtdata;
infile cards dlm=',';
retain fmtname 'pcodes' type 'I';
input start label;
cards;
1,10
2,20
3,30
4,40
5,50
;

 

proc format cntlin = fmtdata;
run;
%macro comb(dsout2=&dsout,dsin2=__d1,k=,n=,len=) ;
%local dsin2 dsout2 numvar i start stop n k len ;
%let start = 1 ;
%let stop = %eval( &n - &k) ;
data &dsout2 ;
set &dsin2 ;
array _vals {&n} $ ;
array combo {&k} $ &len ;
keep j combo1 - combo&k ;
retain j 1 ;
%do i = 1 %to &k;
%let stop = %eval(&stop + 1) ;
do var&i = &start to &stop ;
%let start = %str(%(var&i + 1 %)) ;
%end ;
%do i = 1 %to &k ;
combo{&i} = _vals{var&i} ;
%end ;
output ;
j + 1 ;
%do i = 1 %to &k ;
end ;
%end ;
run;
%mend comb;
%macro setup(dsout=_NULL_,dsin=&syslast,var=,ks=) ;
%local dsout dsin var totvar ks ;
proc freq data=&dsin noprint ;
tables &var / out=__count ;
data _NULL_ ;
set __count end=last ;
retain len oldlen 0 ;
nk + 1 ;
len = length(&var) ;
if ( len > oldlen ) then oldlen = len ;
if count > 1 then do;
put "WARNING: The variable %upcase(&var)" /
"WARNING: has “ count “ occurrences of the value "
var;
end ;
if last then do ;
if &ks > nk then do;
put "ERROR: K greater than the Number of values (N)" ;
put "ERROR: Value of K: &ks Value of N: " nk ;
put "ERROR: EXITING Macro!!!!!!!!!!!!!!!!!!" ;
end ;
call symput('totvar',trim(left(put(nk,5.))) ) ;
call symput('length',put(oldlen,3.) );
end ;
run ;
%if &ks > &totvar %then %goto exit ;
data __d1 ;
set __count end=last ;
array _vals {&totvar} $ &length;
retain _vals1-_vals&totvar ;
keep _vals1-_vals&totvar ;
_vals{_N_} = &var ;
if last then output ;
%comb(k=&ks,n=&totvar,len=&length)
run ;
%exit: %mend setup ;
%setup(var=start,ks=1,dsin=fmtdata,dsout=combin1)
%setup(var=start,ks=2,dsin=fmtdata,dsout=combin2)
%setup(var=start,ks=3,dsin=fmtdata,dsout=combin3)
%setup(var=start,ks=4,dsin=fmtdata,dsout=combin4)
%setup(var=start,ks=5,dsin=fmtdata,dsout=combin5)
data want (drop=i j);
set combin1 (rename=(combo1=item1))
combin2 (rename=(combo1-combo2=item1-item2))
combin3 (rename=(combo1-combo3=item1-item3))
combin4 (rename=(combo1-combo4=item1-item4))
combin5 (rename=(combo1-combo5=item1-item5));
array combos(5) item1-item5;
array prices(5);
do i=1 to 5;
prices(i)=input(combos(i),pcodes.);
end;
total=sum(of prices(*));
run;

 

Super User
Posts: 17,912

Re: All possible shopping lists

See the solutions here:

https://communities.sas.com/t5/Base-SAS-Programming/Selecting-distinct-Combinations/m-p/318086

 

@Ksharp's solution using GRAYCODE() is probably the easiest. You can also load the temporary array from a data set if desired. 

 

Trusted Advisor
Posts: 1,401

Re: All possible shopping lists

Tested solution given:

 

data prices;
  input item price;
  n+1; drop n;
  call symput('items',left(n));
datalines;
1 10
2 20
3 30
4 40
5 50
; run;
%put ITEMS= &items;

proc transpose data=prices 
         out=price_array(drop=_name_)
         prefix=I;       
  var price;
  id item;
run;

proc sql;
  select name into: vnames separated by ' '
  from sashelp.vcolumn where memname=upcase('price_array');
quit;
%put NAMES= &vnames;

data _null_;
   varnames = '"'||tranwrd("&vnames",' ','" "')||'"';
   call symput('varnames',varnames);
run;

data shop_list;
     length items_list $100;
     retain items_list;
 set price_array;
    array ix {&items} $ (&varnames);
    array pr {&items} &vnames;
    
    do i=1 to &items;
       items_list = ' '; 
       price = pr(i); 
       items_list = ix(i);
       output;
       if i < &items then do;          
          do j=i+1 to &items;
             items_list = catx(' ',items_list,ix(j));
             price = price + pr(j);
             output;
          end;
       end;
   end;
   keep items_list price;
run;




    

Super Contributor
Posts: 413

Re: All possible shopping lists

[ Edited ]

Hi Shmuel,

 

I ran the code, but it seems to be missing some combinations. For example, it omits combinations such as (i1,i5) or (i1,i2,i4).

There should be 31 combinations (if you don't count the empty combination), but here there are 15. 

 

Thanks!

Super Contributor
Posts: 413

Re: All possible shopping lists

thanks Reeza for the link, using your solution I could also obtain the total price of each shopping list! 

 

Also useful to learn about this graycode function, althought just from its name its not so obvious that it has got something to do with combinations!

Super User
Posts: 17,912

Re: All possible shopping lists

You can load a temporary array from a dataset. 

See the example here:

https://gist.github.com/statgeek/f052b5223fecca066b1f

Super Contributor
Posts: 413

Re: All possible shopping lists

I see, that is how it is possible to get the observations of a variable into an array without transposing (related to my reply to Astounding)

Super User
Posts: 5,093

Re: All possible shopping lists

Since we're assuming there are 5 items, there is no need for macro language.  The program can be simplified to:

 

data want;

array items {5} item1-item5;

array prices {5} price1-price5;

array v {5} v1-v5;

if _n_=1 then do i=1 to 5;

   set have;

   items{i}=item;

   prices{i}=price;

end;

 

That much in effect transposes the existing data.  Then continue the same DATA step to find all combinations:

 

do v1=0, price1;

do v2=0, price2;

do v3=0, price3;

do v4=0, price4;

do v5=0, price5;

   item_list='NNNNN';

   do i=1 to 5;

      if v{i} then substr(item_list, i, 1)='Y';

   end;

   total_cost = sum(of v1-v5);

   output;

end; end; end; end; end;

stop;

run;

 

Now this does give you an extra observation, where no items were purchased.  You get the total cost, plus a 5-character ITEM_LIST consisting of a set of "N" and "Y" values.  If the first character is "Y", the first item was purchased.  If the second character is "Y", the second item was purchased, etc.

Super Contributor
Posts: 413

Re: All possible shopping lists

Hi Astounding,

 

based on Ksharp's first solution in the link that Reeza posted, I managed to write a code that gets a solution similar to yours, although it is in several steps:

 

data have;
  input item price;
datalines;
1 10
2 20
3 30
4 40
5 50
; 
run;
/*first get the total prices of all combinations */
proc transpose data=have out=temp;
 var price;
run;

proc summary data=temp;
 class col:;
 output out=total_price;
run;

data total_price;
set total_price;
total_price = sum(of col1-col5);
run;

/*now get the combinations of the item names*/
proc transpose data=have out=temp2;
 var item;
run;
proc summary data=temp2;
 class col:;
 output out=item_names;
run;

data item_names;
 set item_names;
 length item_names $ 20;
 item_names=catx(' ',of col:);
 keep item_names _type_;
run;

/*and now merging item_names with total_price by _TYPE_*/
DATA shopping_lists;
 MERGE total_price item_names;
 BY _TYPE_;
 RUN;

Your code is in one step, and it has do-end loops for each item. Is it possible to have a macro which will do the process automatically for any number of items? 

Super User
Posts: 5,093

Re: All possible shopping lists

It's possible, but there is a lot to consider concerning what parameters should exist.  Normally, any of these would represent possibilities.

 

Name of the input data set

Name of the output data set

Maximum number of items to select ... will it always match the maximum number available in the data set?

Names of the PRICE variables (are they always price1-priceN?)

Names of the ITEM variables (are they always item1-itemN?)

 

Other considerations that would affect how the program (or how the macro) is written:

 

Do you want a variable showing how many items were selected?

Can any PRICE values be zero?

Super Contributor
Posts: 413

Re: All possible shopping lists

[ Edited ]

I guess that macros in this case would be complex. I did manage to do it in one step by using Reeza's solution to the link that he posted by adding a second price array to his code. Here it is:

 

data want;
 *can load temporary array from dataset if required;

 array x1[5]  _temporary_ (10 20 30 40 50  );
 array x2[5] $2 _temporary_ ('1' '2' '3' '4' '5');
 *array for output;
 array y1(5) ;
 array y2(5);

 n=dim(x1);

 *loop overall all possible output ranges;
 do k=1 to n;

 *Determine number of combinations required;
 ncomb=comb(n, k);

  *Create all combinations;
  do j=1 to ncomb;

     *create the combination;
     call lexcomb(j, k, of x1[*]);
     call lexcomb(j, k, of x2[*]);  
     *Set all values to missing;
     call missing(of y1(*));
     call missing(of y2(*)); 
     *Copy over the values to the array for output;
     do i=1 to k;
        y1(i)=x1(i);
		y2(i)=x2(i);
     end;
 
     *output record;
     output;
 
  end;
end;
run;

data want;
set want;
total_price = sum (of y11 - y15);
run;

Maybe this code could be turned into a macro if the item and price variables can be put inside the 2 temporary arrays. I am not sure if the observations of a row can be put into an array, unless it is necessary to transpose the "have" and like this put the now columns into the temporary arrays 

Solution
‎01-15-2017 11:08 PM
Super User
Posts: 5,093

Re: All possible shopping lists

[ Edited ]

And a macro version ... not very flexible and making assumptions about the required parameters and processing ...

 

%macro combinations (have=, want=, n_items=);

 

%local i;

 

data &want;

array items {&n_items} item1-item&n_items;

array prices {&n_items} price1-price&n_items;

array v {&n_items};

do i=1 to 5;

   set &have;

   items{i} = item;

   prices{i} = price;

end;

%do i=1 %to &n_items;

   do v&i = 0, price&i;

%end;

length item_list $ &n_items;

item_list = repeat('N', &n_items-1);

   do i=1 to &n_items;

      if v{i} then substr(item_list, i, 1)='Y';

   end;

   total_cost = sum(of v{*});

   output;

%do i=1 %to &n_items;

   end;

%end;

stop;

run;

 

%mend combinations;

 

The code is untested at this point.  

 

Even though I've written several programs of this type, I've never seen fit to convert it to a macro.  The requirements always seem to change enough each time so that it's easier to copy and modify a SAS program.

 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 15 replies
  • 265 views
  • 9 likes
  • 6 in conversation