BookmarkSubscribeRSS Feed
AaronEvans
Calcite | Level 5

Hi,

 

Suppose I have two tables:

Transactions:

PRODUCT TRAN_DATE
1       05JUL19
1       07JUL19
1       01AUG19
2       04APR19
2       21MAY19
2       16JUN19

 

Prices:

PRODUCT PRICE_DATE PRICE

1       01JAN19    10

1       06JUL19    11

2       01JAN19    1000

2       01MAY19    1100

2       01AUG19    1200

 

I want the price in effect as of the transaction date. So the price that corresponds to the maximum price_date that is less than or equal to the tran_date. What is the correct SQL to return a table like below?

PRODUCT TRAN_DATE PRICE_DATE PRICE
1       05JUL19   01JAN19    10
1       07JUL19   06JUL19    11
1       01AUG19   06JUL19    11
2       04APR19   01JAN19    1000
2       21MAY19   01MAY19    1100
2       16JUN19   01MAY19    1100

 

Thanks,

Aaron

7 REPLIES 7
PeterClemmensen
Tourmaline | Level 20

Does it have to be an SQL solution?

AaronEvans
Calcite | Level 5
Hi- it does not have to be SQL.

I have about 500k records in my transaction table and 300k in the price table. I coded a brute force solution that involves joining every available product price to every transaction by product. From there I sort by dates and choose from valid dates. My brute force approach takes a long time to run and I thought there must be a more elegant solution that runs faster.

Thanks,
Aaron
PeterClemmensen
Tourmaline | Level 20

Still brute force but probably faster 

 

data Transactions;
input PRODUCT TRAN_DATE :date7.;
format TRAN_DATE date7.;
datalines;
1 05JUL19
1 07JUL19
1 01AUG19
2 04APR19
2 21MAY19
2 16JUN19
;

data Prices;
input PRODUCT PRICE_DATE :date7. PRICE;
format PRICE_DATE date7.;
datalines;
1 01JAN19 10
1 06JUL19 11
2 01JAN19 1000
2 01MAY19 1100
2 01AUG19 1200
;

data temp;
    set Prices;
    by PRODUCT;
    if not (last.PRODUCT) then do;
        p=_N_+1;
        set Prices(keep=PRICE_DATE rename=(PRICE_DATE=_PRICE_DATE)) point=p;
    end;
    else _PRICE_DATE=today();
run;

proc sql;
    create table want as
    select t.*, PRICE_DATE, PRICE
    from transactions t, temp p
    where t.PRODUCT=p.PRODUCT & PRICE_DATE le TRAN_DATE le _PRICE_DATE;
quit;
Shmuel
Garnet | Level 18

Create a new price table from its origin having: product ID, price_start_date, price_end_date, price.

Use next code to create it:

proc sort data=prices out=temp_pr;
        by product descending price_date;
run;

data new_prices;
  set temp_pr;
       by product;
           retain price_start_date price_end_date;
           drop price_date;
           if first.product then do;    /* current price */
              price_start_date = price_date;
              price_end_date = '31dec2999'd;  /* any high date */
              output;
         end; else do;
              if last.product then 
                 price_start_date = .;
			  else 
			     price_start_date = price_end_date +1;
              price_end_date = price_date;
			  output;
		 end;
run;

Now that you have both dates, starting and ending, its easier to be used by sql:

proc sql;
     create table wand as
     select a.*, p.price
    from transactions a
    left join new_prices
    on a.product = b.product and
    tran_date between price_start_date and price_end_date.
run;
andreas_lds
Jade | Level 19

Just an idea:

  • sort price by product and descending price_date
  • use two temporary arrays, load price and price_date into those arrays per product
  • use a loop to find the first price_date <= tran_date
  • retain the index and use that for further processing
Ksharp
Super User
data Transactions;
input PRODUCT TRAN_DATE :date7.;
format TRAN_DATE date7.;
datalines;
1 05JUL19
1 07JUL19
1 01AUG19
2 04APR19
2 21MAY19
2 16JUN19
;

data Prices;
input PRODUCT PRICE_DATE :date7. PRICE;
format PRICE_DATE date7.;
datalines;
1 01JAN19 10
1 06JUL19 11
2 01JAN19 1000
2 01MAY19 1100
2 01AUG19 1200
;


proc sql;
     create table want as
   select a.*,b.price_date, b.price
    from transactions a
    left join prices b
    on a.product = b.product and tran_date >= price_date
	group by a.product, a.tran_date
	having price_date=max(price_date);
run;
andreas_lds
Jade | Level 19

I discarded some of the ideas i had earlier. First two steps are for setting the size of the temp arrays. Performance has to be tested.

 

proc freq data=work.prices noprint order=freq;
   table product / nopercent out=work.ProductCount(keep=product count) ;
run;

data _null_;
   set work.ProductCount(obs=1);
   call symputx('maxElements', count);
run;


data want(keep=product tran_date price_date price);
   if 0 then set work.Transactions;
   array d[&maxElements.] _temporary_;
   array p[&maxElements.] _temporary_;

   call missing(of d[*]);
   call missing(of p[*]);

   do i = 1 to dim(d) until (last.product);
      set work.Prices;
      by product;

      d[i] = price_date;
      p[i] = price;
   end;

   c = i;
   s = 1;

   do until(last.product);
      set work.Transactions;
      by product;

      do i = s to c while(d[i] <= tran_date);
      end;

      i = i-1;
      s = i;
      price_date = d[i];
      price = p[i];

      output;
   end;
run;
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 2284 views
  • 0 likes
  • 5 in conversation