BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
BarryP
Quartz | Level 8

I've been struggling with this for a while and thought I'd throw the question up here to see what happens.

Consider this hypothetical.  I've got a company that sells widgets.  

1 for $5
2 for $9
3 for $13
4 for $16

Customer A buys 1, Customer B buys 2, C buys 3, D buys 4.  This can be solved with a simple merge.  

What if I have a customer who buys 9?  He would basically buy 4, 4, and 1 for 16 + 16 + 5 = $37.


Is there a way to do this with some sort of Do Until loop?  

I've already solved this by hard coding the transformations with do until loops and if then statements, but I'd like to do it with a lookup tables so that I can change the prices later without having to change the program.

I have also figured out how to do this by just coping and pasting the merge code and repeating the data step over and over again.  The problem there is I would have to guess what the maximum time I would ever need to do this would be.  For example, with this case I would run the code three times.  However, if a customer buys 17 widgets, I'd have to run the code 5 times.

Any thoughts?  Thanks! 


1 ACCEPTED SOLUTION

Accepted Solutions
yabwon
Onyx | Level 15

hi @BarryP ,

 

I did some experiments with your setup, and with one assumption that the "dictionary of translation of instruments" is sorted in the order you want it to be (i.e. in order of precedence) the following code seems to do the job.

 

data dict;
input A B C X Y Z;
cards;
2 2 0 3 1 0
2 1 0 2 0 1
1 1 0 2 1 0
1 0 1 1 0 2
;
run;

data customers;
input instr_A instr_B instr_C;
cards;
6 5 1
4 4 0
2 2 0
2 1 0
1 1 0
3 3 0
1 0 1
3 0 3
2 1 1
;
run;

/*options ps = max ls = max;*/
data want;
  set customers;
  array instr instr_A instr_B instr_C;
  array test t_A t_B t_C;
  array new instr_X instr_Y instr_Z;
  array abc a b c;
  array xyz x y z;
  call missing(of new[*]);

  point = 1;
  do until(stop);    
    set dict point = point nobs = nobs;
      do over instr;
        test = instr;
      end;
      do over test;
        test = test - abc;
        test = (test < 0);
      end;
      tot_test = sum(of test[*]);
    
      put _all_;
      if tot_test then point + 1;
      else
        do;
          do over instr;
            instr = instr - abc;
            new + xyz;
          end;            
        end;
     
      stop = (point > nobs) or sum(of instr[*]) = 0;
  end;
  output;
  put;
  keep instr_:;
run;

data final;
  set want;
  set customers;
run;

 

All the best

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



View solution in original post

36 REPLIES 36
PaigeMiller
Diamond | Level 26

Iterations don't seem necessary.

 

How about using math? If a customer buys 9, then 9 integer divide by 4 is 2, plus a remainder of 1. So the math is 2*(cost of 4) + (cost of 1). If your lookup table is an array which I shall creatively name COST, then for any quantity purchased n

 

if mod(n, dim(cost)) = 0 then total_cost = floor(n/dim(cost)) * cost(dim(cost));
else total_cost = floor(n/dim(cost)) * cost(dim(cost)) + cost(mod(n, dim(cost)));
 
 
--
Paige Miller
BarryP
Quartz | Level 8

Hi Page, can you help me out with that a bit?  I thought that maybe an "array" might help.  I've taken the first 2 SAS programming classes and have about 2 years experience.  I'm sure there's a new technique I need do delve into.

Can you give me a little more detail?  Thanks!

FWIW, the way I would normally do this would be:

GIVEN TABLES:

Customer_Table:

Customer   Num_items
A                       1

B                       2
C                       3
D                       4
E                       9


Cost_Table

Num_items     Cost
1                                $5

2                                $9

3                                $13

4                                $16
9                                $37


SAS CODE: 

Data New_Table;

      merge Customer_Table Cost_Table;
     by Num_items;

run;


The problem here, or course, is that I'd have to have a cost for every possible value for num_items.


                                

PaigeMiller
Diamond | Level 26

So data set Customer is as you show it.

 

Then

 

data want;
     array cost cost1-cost4 (5 9 13 16);
     set customer;
     if mod(num_items, dim(cost)) = 0 then total_cost = 
        floor(num_items/dim(cost)) * cost(dim(cost));
    else total_cost = floor(num_items/dim(cost)) * cost(dim(cost)) + 
        cost(mod(num_items, dim(cost)));
run;
 
--
Paige Miller
Patrick
Opal | Level 21

Code using a lookup table could look like below.

data price;
  input quantity price;
  datalines;
1 5
2 9
3 13
4 16
;

data sample;
  input quantityBought;
  datalines;
9
25
5
;

data want(drop=_:);
  if _n_=1 then
    do;
      if 0 then set price;
      dcl hash h1(dataset:'price', ordered:'y', hashexp:4);
      dcl hiter hh1('h1');
      h1.defineKey('quantity');
      h1.defineData('quantity', 'price');
      h1.defineDone();
    end;

  set sample;
  _remain=quantityBought;
  _rc = hh1.last();
  do while (_rc = 0);
    _factor=int(_remain/quantity);
    if _factor>0 then
      do;
        PriceSum=sum(PriceSum, _factor*price);
        _remain=_remain-_factor*quantity;
        if _remain<=0 then leave;
      end;
    _rc = hh1.prev();
  end;
run;

proc print data=want;
run;
BarryP
Quartz | Level 8

Hmmm, this may be the way to do it.  I don't understand all of your code, but I'm guessing there's something in there that pulls elements out of the price table and treats them like variables?


Patrick
Opal | Level 21

@BarryP wrote:

Hmmm, this may be the way to do it.  I don't understand all of your code, but I'm guessing there's something in there that pulls elements out of the price table and treats them like variables?



Given that you're using SAS since two years it's likely worth that you skill-up with the SAS hash and hash iterator object.

BarryP
Quartz | Level 8

Looking it up now.  😉

mkeintz
PROC Star

To be honest, I don't see this single-product problem as a very good example of the benefits of hash, especially because the priced quantities are integers from 1 to 4.  @PaigeMiller's code appears to be very compact for this problem, using an array indexed by the integers 1:4.

 

If what your are really looking for is having the prices in a separated data set, @PaigeMiller's code still works.  It just has to be preceded by a proc transpose:

 

data prices;
  input cost;
datalines;
5
6
13
16
run;
proc transpose data=prices out=costs (drop=_name_) prefix=cost;
run;

data orders;
  input qty_ordered;
datalines;
5
25
19
run;

data want;
  set orders;
  if _n_=1 then set costs;
  array _prcs {*} cost:  ;
  total_cost=floor(qty_ordered/dim(_prcs))*_prcs{dim(_prcs)};
  if mod(qty_ordered,dim(_prcs))^=0 then total_cost=total_cost + _prcs{mod(qty_ordered,dim(_prcs))};
run;

 

However, I think using hash would be very educational if you had prices for multiple products ("A","B", etc.) up in the prices dataset.  Then I could see enhancing @PaigeMiller's code with a hash, keyed on the product ID.

 

Code above changed per @PaigeMiller's observation.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
PaigeMiller
Diamond | Level 26

I agree with your comments about hash being unnecessary for the problem as stated.

 

Your code needs to be adjusted for the case where this quantity is equal to zero.

mod(qty_ordered,dim(_prcs)) 
 
--
Paige Miller
BarryP
Quartz | Level 8

I used the price and quantity to simplify the problem.  The real problem is a little more complicated.  We are replacing combinations of lab instruments with different combinations of new lab instruments.  So its more like:

1 Inst_A & 1 Inst_B = 2 inst_X & 1 inst_Y
2 inst_A & 1 Inst_B = 1 inst_Z
2 inst_A & 2 Inst_B = 3 inst_X & 1 inst_Y
1 inst_A & 1 inst_C = 1 inst_X & 2 inst_Z

etc.

I have a list of ~100 different combinations of 6 instruments and what they translate to (different combinations of 5 instruments).  The lookup table accounts for about 80% of our customer base.  Its the remaining 20% that I'm trying to write algorithms for.  

For example, I have a lookup of 6 Inst_B & 1 Inst_C but I have a customer with 13 Inst_B and 4_Inst_C.  My current algorithm translates 6 Inst_B & 1 Inst_C, then 6 Inst_B & 1 Inst_C again, then 1 Inst_B & 1_Inst_C, and then finally 1 Inst_C.

Those translate to (respectively):

1 Inst_W, 3 Inst_Y, 1 Inst_Z
1 Inst_W, 3 Inst_Y, 1 Inst_Z
1 Inst_V, 1 Inst_Z
0 instruments

I add them together to = 1 Inst_B, 2 Inst_W, 6 Inst_Y, 3 Inst_Z.

When using the lookup table its pretty easy.  I have a code that says, for example, 6B1C and the table has columns for V, W, X, Y, Z and just pulls them in with the merge.  

I haven't had the chance to look at the two methods proposed yet, so I don't know if it makes a difference when pulling in 5 different columns from combinations of 6 different columns.

FWIW, here's a snippet of the code I'm currently using.  It works just fine, but I'm trying to learn more efficient methods.  Keep in mind that I omitted a bunch of code that makes this work.  The gist is that in the I run through the loop, adding the new instruments and subtracting the old ones until Loop_Control = 0.   


Do Until (Loop_Control = 0);

     If Inst_B = 1 and Inst_C = 0 and Profile = 'A&B Only' then do;

          Inst_V = Inst_V + 1;
          Inst_W = Inst_W + 0;
          Inst_X = Inst_X + 0;
          Inst_Y = Inst_Y + 0;
          Inst_Z = Inst_Z + 1;

          Inst_B = Inst_B - 1;
          Inst_C = Inst_C - 0;

     end;

 If Inst_B = 1 and Inst_C = 1 and Profile = 'A&B Only' then do;

          Inst_V = Inst_V + 1;
          Inst_W = Inst_W + 0;
          Inst_X = Inst_X + 0;
          Inst_Y = Inst_Y + 0;
          Inst_Z = Inst_Z + 1;

          Inst_B = Inst_B - 1;
          Inst_C = Inst_C - 1;

     end;

If Inst_B gt 1 and Inst_C = 1 and Profile = 'A&B Only' then do;

          subprofile = 'out of scope';
          Inst_V = Inst_V + 1;
          Inst_W = Inst_W + 0;
          Inst_X = Inst_X + 0;
          Inst_Y = Inst_Y + 0;
          Inst_Z = Inst_Z + 1;

          Inst_B = Inst_B - 1;
          Inst_C = Inst_C - 1;

     end;

 

     If Profile = 'A&B Only' then Loop_Control = Inst_A;

     else Loop_Control = 0;

end;

 

 

mkeintz
PROC Star

@BarryP wrote:



For example, I have a lookup of 6 Inst_B & 1 Inst_C but I have a customer with 13 Inst_B and 4_Inst_C.  My current algorithm translates 6 Inst_B & 1 Inst_C, then 6 Inst_B & 1 Inst_C again, then 1 Inst_B & 1_Inst_C, and then finally 1 Inst_C.

Those translate to (respectively):

1 Inst_W, 3 Inst_Y, 1 Inst_Z
1 Inst_W, 3 Inst_Y, 1 Inst_Z
1 Inst_V, 1 Inst_Z
0 instruments

 

So in this example are you saying that you have the following specific instrument translations available? 

    - written here as (A,B,C,D,E,F)  (V,W,X,Y,Z):

 

(0,6,1,0,0,0)   (0,1,0,3,1)     From 0A, 6B, 1C, 0D, 0E, 0F   to   0V, 1W, 0X, 3Y, 1Z   (Used twice)

(0,1,1,0,0,0)   (1,0,0,0,1)     From 1B and 1C to 1V, 1Z  (Used once)

 

What is the translation for the single remaining 1C?   Is that the "0 instruments" line?   I don't understand quite understand that. 

 

Does it mean that for some customers, having a single "leftover" instrument of a particular type gets no translation (or officially translates to zero)?  I would ordinarily assume that, in addition to some standard list of lookup combinations, there would always be a translation for each of the single original instruments   Then you could account for leftovers.

 

 

In a way, this is a mathematical programming problem, where there may be more than one collection of translations that satisfy the customer order, but one would presumably choose the combination that minimizes the customer's cost - or optimizing some other criterion subject to the selected translations exactly satisfying the initial set of instruments.

 

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
BarryP
Quartz | Level 8

Yes, you understand it correctly.

I find the 0 translation weird, too, but I don't work in sales.  I just have to use the numbers that they give me.  

PaigeMiller
Diamond | Level 26

I used the price and quantity to simplify the problem. The real problem is a little more complicated. We are replacing combinations of lab instruments with different combinations of new lab instruments. So its more like:

 

The same math works product by product that I provided above. Essentially, you do the same programming for each instrument, and then add things up. Perhaps there are smarter or faster ways to do this, but I still don't see a need for any looping or iterations within a product. You could write a loop that goes over all products.

--
Paige Miller
yabwon
Onyx | Level 15

hi @BarryP ,

 

I did some experiments with your setup, and with one assumption that the "dictionary of translation of instruments" is sorted in the order you want it to be (i.e. in order of precedence) the following code seems to do the job.

 

data dict;
input A B C X Y Z;
cards;
2 2 0 3 1 0
2 1 0 2 0 1
1 1 0 2 1 0
1 0 1 1 0 2
;
run;

data customers;
input instr_A instr_B instr_C;
cards;
6 5 1
4 4 0
2 2 0
2 1 0
1 1 0
3 3 0
1 0 1
3 0 3
2 1 1
;
run;

/*options ps = max ls = max;*/
data want;
  set customers;
  array instr instr_A instr_B instr_C;
  array test t_A t_B t_C;
  array new instr_X instr_Y instr_Z;
  array abc a b c;
  array xyz x y z;
  call missing(of new[*]);

  point = 1;
  do until(stop);    
    set dict point = point nobs = nobs;
      do over instr;
        test = instr;
      end;
      do over test;
        test = test - abc;
        test = (test < 0);
      end;
      tot_test = sum(of test[*]);
    
      put _all_;
      if tot_test then point + 1;
      else
        do;
          do over instr;
            instr = instr - abc;
            new + xyz;
          end;            
        end;
     
      stop = (point > nobs) or sum(of instr[*]) = 0;
  end;
  output;
  put;
  keep instr_:;
run;

data final;
  set want;
  set customers;
run;

 

All the best

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 36 replies
  • 1017 views
  • 3 likes
  • 9 in conversation