Solved: Re: How to use a lookup table using iterations? - Page 2

BarryP · Posted 01-21-2020 05:02 PM

Hi @yabwon

I'm pretty sure this is going to work. I will likely have some questions about your code, so if you can, please check back in.

Thanks!

yabwon · Posted 01-22-2020 06:33 AM

Hi @BarryP,

Glad I could help. If you have any questions - ask 🙂

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation

BarryP · Posted 01-22-2020 11:23 PM

@yabwon

Hi Bart. Thanks again. I've figured out how to make this code work. I've figured out most of it but there were a few places where I could figure out what was happening but didn't understand how the syntax worked. I'm hoping you can answer:

Basically I don't understand the purpose of the two put statements, and more importantly the test = (test < 0) statement. I know what it does by figuring out what the program needed to do under certain conditions, but I don't know how that statement works and couldn't figure out what to look up.

Thanks!

data want;
  set customers;
  array instr instr_A instr_B instr_C;
  array test t_A t_B t_C;
  array new instr_X instr_Y instr_Z;
  array abc a b c;
  array xyz x y z;
  call missing(of new[*]);

  point = 1;
  do until(stop);    
    set dict point = point nobs = nobs;
      do over instr;
        test = instr;
      end;

*********************************;
      do over test;		*;
        test = test - abc;	*;
        test = (test < 0);	*;
      end;			*;
      tot_test = sum(of test[*]) ;	
*********************************;
    
* I figured out that this will give tot_test a value 
	  if one of the elements of test - abc is less than 0
	  which causes point to increase by one and then the 
	  loop goes back to the top.  How?  
	  What is test = (test < 0) doing?
;


      put _all_;  *why?  This seems unnecessary;



      if tot_test then point + 1;
      else
        do;
          do over instr;
            instr = instr - abc;
            new + xyz;
          end;            
        end;
     
      stop = (point > nobs) or sum(of instr[*]) = 0;
  end;
  output;
  put;       *why?  This seems unnecessary;
  keep instr_:;
run;

BarryP · Posted 01-22-2020 11:57 PM

@yabwon

Okay after some more experimenting it's clear that this line:

test = (test < 0) turns each variable in test into a 1 if it is < 0, or a zero if it is greater than or equal to zero.

The syntax seems strange. Is there a longhand way of writing this? Also, is there a source that explains this? I didn't even know what to search on.

Thanks!

yabwon · Posted 01-23-2020 03:10 AM

Hi @BarryP ,

the `put _all_;` is just to display status of the PDV vector in the log, I should comment it and sorry for that.

About the `total = (total < 0)` - I'm using here concept of "implicit" array declaration. Please check out this article for details:

http://support.sas.com/resources/papers/proceedings10/209-2010.pdf

and this:

https://support.sas.com/resources/papers/proceedings/proceedings/sugi30/242-30.pdf

All the best

Bart

P.S. Longhand version:

data want;
  set customers;
  array instr[3] instr_A instr_B instr_C;
  array test[3] t_A t_B t_C;
  array new[3] instr_X instr_Y instr_Z;
  array abc[3] a b c;
  array xyz[3] x y z;
  call missing(of new[*]);

  point = 1;
  do until(stop);    
    set dict point = point nobs = nobs;
      do _I_ = lbound(instr) to hbound(instr); drop _I_;
        test[_I_] = instr[_I_];
      end;

      do _I_ = lbound(test) to hbound(test);
        test[_I_] = test[_I_] - abc[_I_];
        test[_I_] = (test[_I_] < 0);
      end;
      tot_test = sum(of test[*]) ;	

      /*put _all_;*/
      if tot_test then point + 1;
      else
        do;
          do _I_ = lbound(instr) to hbound(instr);
            instr[_I_] = instr[_I_] - abc[_I_];
            new[_I_] + xyz[_I_];
          end;            
        end;
     
      stop = (point > nobs) or sum(of instr[*]) = 0;
  end;
  output;
  /*put;*/
  keep instr_:;
run;

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation

BarryP · Posted 01-23-2020 10:28 AM

@yabwon Thanks! I'll read the articles.

I think I just figured it out. It appears that when you have a variable (or an array) = something in parenthesis, it returns a 1 if true, and a 0 if false.

So, for example, lets assume I have a column called Name, and I want a column called IsSmith which will contain a 1 if Name = Smith and a 0 if it doesn't, then I could write the statement liek this:

IsSmith = (Name = 'Smith');

In Excel it would be written this way:

= if(Name = 'Smith',1,0)

unison · Posted 01-19-2020 09:08 PM

Take a look at this and let me know if this is aligned with what you're wanting:

data have;
	do num_items=1 to 50;
		output;
	end;
run;

data prices;
	input bulk bulk_price;
	datalines;
1 5
2 9
3 13
4 16
;
run;

proc sql noprint;
	select bulk_price, max(bulk) into 
		:price_list separated by ",", :num_prices from prices;
quit;

data want;
	set have;
	remainder=num_items;

	if num_items le &num_prices then
		price=1*scan("&price_list", num_items, ",");
	else
		do;
			array arr[&num_prices];
			do i=dim(arr) to 1 by -1;
				arr[i]=int(remainder/i);

				if remainder >=i*arr[i] and arr[i] > 0 then
					do;
						remainder=sum(remainder, -i*arr[i]);
						price=sum(price, arr[i]*scan("&price_list", i, ","));
					end;
			end;
		end;
	drop i arr: remainder;
run;

-unison

s_lassen · Posted 01-20-2020 04:49 AM

I assume you are looking for a general solution to this problem.

This is one of the cases where I think it may be advantageous to have a "broad" lookup table, e.g.:

data prices;
  input ItemCode n1-n5 price1-price5;
cards;
1 4 3 2 1 . 16 13 9 5 .
2 10 5 3 2 1 45.50 25 18 14 8
;run;

I put in an extra item with 5 prices, just for example. The ItemCode variable denotes the type of Item. Note that the numbers (n1-n5) and prices are arranged with the highest quantity first, and that the whole table is sorted by ItemCode.

Assuming your data is like this (again sorted by ItemCode, the InvoiceNo is just a hypothetical number identifying the invoice):

data sales;
input InvoiceNo ItemCode n;
cards;
13121 1 9
2133 1 14
13121 2 11
123123 2 25
;run;

You can calculate the "fair" price like this:

data TotalCost;
  merge sales(in=sale) prices;
  by ItemCode;
  if sale;
  array numbers n1-n5;
  array prices price1-price5;
  TotalCost=0;
  remain=n;
  do _N_=1 to dim(numbers) while(remain>0);
    TotalCost=TotalCost+int(remain/numbers(_N_))*prices(_N_);
    remain=remain-int(remain/numbers(_N_))*numbers(_N_);
    end;
  keep InvoiceNo ItemCode n TotalCost;
run;

If your quantities are whole numbers, and the last valid number in the numbers array is always 1, this should work.

The solution assumes that you cannot get a better price by splitting e.g. 5 items into 3+2 instead of 4+1.

KachiM · Posted 01-20-2020 05:43 AM

Hi @BarryP ,

Here is one way using array to hold the prices for 0, 1, 2, 3 and 4 items. There are several answers and I have not examined all of them. Hope I am not repeating earlier answers.

data want;
   array k[0:4] _temporary_ (0 5 9 13 16);
   do items = 1 to 100;
      mul4 = int(items/4);
      price = mul4 * k[4] + k[mod(items,4)];
      output;
   end;
drop mul4;
run;

BarryP · Posted 01-20-2020 10:27 AM

Hi all,

So as I'm working through this it seems like the key is that in order to treat the elements of a lookup table like variables, I need to put them into an array. Is that correct?

So, for example if I have a sale, 1 item for $4, or 4 items for $12, if I buy 5 items then I would need to take the price for 4 items, subtract the 4 items from the total, and then add the price for the 1 item, subtract that 1 item from the total, and now that I'm at zero items, I leave the loop.

Sorry, I've never worked with arrays in SAS before. So I'm assuming that to use an array like a lookup, its always going to look for the i'th element. So instead of thinking "lookup the element that corresponds with sale_scenario_4," I have to think, "lookup the 4th element in the array and make sure that you have the value for sale-scenario_4 there."

Is that right?

mkeintz · Posted 01-20-2020 11:18 AM

I see two ambiguous issues in the task you describe:

Single instrument issue: Your example shows pricing levels for every quantity from 1 to 4 (or more generally from 1 to Np - the maximum quantity discounted for product p). Because it covers every quantity in the 1:Np interval, it's a perfect fit for the array approach, and your description "lookup the 4th element in the array and make sure that you have the value for sale-scenario_4 there" is spot on.

But it may be that you have discount breaks not from 1 to 4, but at 1, 2, 5, and 10. Then the 4th element is not for quantity 4 but for quantity 10. One can work around this while still using the array approach.
Mixed-instrument translations. As I understand it, you have a lookup table in which various (non-exhaustive) mixes of instruments A,B,C,D will be replaced with some mix of new instruments W,X,Y,Z. This can be more of a problem, because if your customer doesn't have any of the exact translations you have in the lookup, you might not arrive at unique mix of new instruments.

For instance, what if your initial lookup table of {A,B} => {W,X} available translations are (admittedly contrived)
     {6,2} =>     {2,1}
     {2,6} =>     {1,2}
     {0,2} =>     {0,1}
     {2,0} =>     {1,0}

If the customer starts out with {A,B}={6,6}, then you could end up with either
    {W,X} = {2,1} + {0,1} + {0,1} == {2,3}     or
    {W,X} = {1,2} + {1,0} + {1,0} == {3,2}

You would need a way to decide between these two alternatives.
Also in your mixed-instrument lookup table, do those mixes of new instruments also have quantity discounts that do not match what you would produce by applying the instrument-by-instrument discounts in the new mix?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

BarryP · Posted 01-20-2020 01:12 PM

I think you are understanding me correctly.

In a nutshell I have ~100 unique old instrument combinations that will translate to new instrument combinations. If that was all I had, it would be an easy merge. I need to account for, however, combinations outside of that list (the list is a table that was provided to me from marketing).

So for example, I'll lay out the first 8 elements and codes that correspond to the instrument combinations:

Code Inst_A Inst_B
1-A1 1 0

1-A1B1 1 1
1-A2 2 0
1-A2B1 2 1
1-A3 3 0
1-A4 4 0
1-A5 5 0
1-A6 6 0

The translations are the following:

Code InstW InstX InstY InstZ

1-A1 1 0 0 1

1-A1B1 1 0 0 1
1-A2 0 1 0 2
1-A2B1 0 1 0 2
1-A3 0 1 2 0
1-A4 0 1 2 1
1-A5 0 1 3 0
1-A6 0 1 4 0

So if I have, for example, 7 Inst_A, the translation would be 1-A6 + 1-A1 which =

Code InstW InstX InstY InstZ

1-A6 + 1-A1 1 1 4 1

So I think I've resigned myself to the fact that the algorithms that tell you what InstA/InstB combinations are going to have to be hard coded, particularly when dealing with what order to do the calculations in, but I believe that simple 1 dimensional arrays can be used here. (???) Perhaps just calculate W, X, Y, & Z separately. The key here is that I've either done these as table merges or hard coding if/then statements with the math written into the code.

I want to be able to utilize an imported excel table as much as possible. If needed, I can even add a column that just defines the i'th element in the lookup table.

BarryP · Posted 01-20-2020 01:14 PM

Note: Those are hypphens, not minus signs. So 1-A1 is to be read as "One dash A One"

Tom · Posted 01-20-2020 01:41 PM

I am still not sure I understand the problem you working with. It looks like your two tables are just explanations of what individual items each code represents. So in the first table 1-A6 means you want 6 A's. But in the new system 1-A6 means you want one X and 4 Ys. (Why do the two systems use the same codes to mean different things?)

Is the problem one of just converting codes into the list of items they represent?

Or do you also need to solve the much harder problem of then mapping that list of items into a list of codes from the new system that represents the same items?

mkeintz · Posted 01-20-2020 01:57 PM

@BarryP wrote:

I think you are understanding me correctly.

In a nutshell I have ~100 unique old instrument combinations that will translate to new instrument combinations. If that was all I had, it would be an easy merge. I need to account for, however, combinations outside of that list (the list is a table that was provided to me from marketing).

So for example, I'll lay out the first 8 elements and codes that correspond to the instrument combinations:

Code Inst_A Inst_B
1-A1 1 0

1-A1B1 1 1
1-A2 2 0
1-A2B1 2 1
1-A3 3 0
1-A4 4 0
1-A5 5 0
1-A6 6 0

The translations are the following:

Code InstW InstX InstY InstZ

1-A1 1 0 0 1

1-A1B1 1 0 0 1
1-A2 0 1 0 2
1-A2B1 0 1 0 2
1-A3 0 1 2 0
1-A4 0 1 2 1
1-A5 0 1 3 0
1-A6 0 1 4 0

So if I have, for example, 7 Inst_A, the translation would be 1-A6 + 1-A1 which =

Code InstW InstX InstY InstZ

1-A6 + 1-A1 1 1 4 1

So I think I've resigned myself to the fact that the algorithms that tell you what InstA/InstB combinations are going to have to be hard coded, particularly when dealing with what order to do the calculations in, but I believe that simple 1 dimensional arrays can be used here. (???) Perhaps just calculate W, X, Y, & Z separately. The key here is that I've either done these as table merges or hard coding if/then statements with the math written into the code.

I want to be able to utilize an imported excel table as much as possible. If needed, I can even add a column that just defines the i'th element in the lookup table.

First, I don't think your have to hard-code anything, IF … your lookup tables also include a code 1-A1, 1-B1, 1-C1, etc. for each of your original instruments A, B, C, etc.

But let's say your customer has 2A's and 2B's, And let's also say you have lookup codes 1-A1, 1-A2B1, 1-B1, and 1-A1B2, but not 1-A2B2. Then you have two possible solutions (1-A2B1+1-B1 or 1-A1B2+1-A1). What criteria would you want to use to chose between them?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Register Today!

SAS Training: Just a Click Away