BookmarkSubscribeRSS Feed
AaronEvans
Calcite | Level 5

Hi,

 

Suppose I have two tables:

Transactions:

PRODUCT TRAN_DATE
1       05JUL19
1       07JUL19
1       01AUG19
2       04APR19
2       21MAY19
2       16JUN19

 

Prices:

PRODUCT PRICE_DATE PRICE

1       01JAN19    10

1       06JUL19    11

2       01JAN19    1000

2       01MAY19    1100

2       01AUG19    1200

 

I want the price in effect as of the transaction date. So the price that corresponds to the maximum price_date that is less than or equal to the tran_date. What is the correct SQL to return a table like below?

PRODUCT TRAN_DATE PRICE_DATE PRICE
1       05JUL19   01JAN19    10
1       07JUL19   06JUL19    11
1       01AUG19   06JUL19    11
2       04APR19   01JAN19    1000
2       21MAY19   01MAY19    1100
2       16JUN19   01MAY19    1100

 

Thanks,

Aaron

7 REPLIES 7
PeterClemmensen
Tourmaline | Level 20

Does it have to be an SQL solution?

AaronEvans
Calcite | Level 5
Hi- it does not have to be SQL.

I have about 500k records in my transaction table and 300k in the price table. I coded a brute force solution that involves joining every available product price to every transaction by product. From there I sort by dates and choose from valid dates. My brute force approach takes a long time to run and I thought there must be a more elegant solution that runs faster.

Thanks,
Aaron
PeterClemmensen
Tourmaline | Level 20

Still brute force but probably faster 

 

data Transactions;
input PRODUCT TRAN_DATE :date7.;
format TRAN_DATE date7.;
datalines;
1 05JUL19
1 07JUL19
1 01AUG19
2 04APR19
2 21MAY19
2 16JUN19
;

data Prices;
input PRODUCT PRICE_DATE :date7. PRICE;
format PRICE_DATE date7.;
datalines;
1 01JAN19 10
1 06JUL19 11
2 01JAN19 1000
2 01MAY19 1100
2 01AUG19 1200
;

data temp;
    set Prices;
    by PRODUCT;
    if not (last.PRODUCT) then do;
        p=_N_+1;
        set Prices(keep=PRICE_DATE rename=(PRICE_DATE=_PRICE_DATE)) point=p;
    end;
    else _PRICE_DATE=today();
run;

proc sql;
    create table want as
    select t.*, PRICE_DATE, PRICE
    from transactions t, temp p
    where t.PRODUCT=p.PRODUCT & PRICE_DATE le TRAN_DATE le _PRICE_DATE;
quit;
Shmuel
Garnet | Level 18

Create a new price table from its origin having: product ID, price_start_date, price_end_date, price.

Use next code to create it:

proc sort data=prices out=temp_pr;
        by product descending price_date;
run;

data new_prices;
  set temp_pr;
       by product;
           retain price_start_date price_end_date;
           drop price_date;
           if first.product then do;    /* current price */
              price_start_date = price_date;
              price_end_date = '31dec2999'd;  /* any high date */
              output;
         end; else do;
              if last.product then 
                 price_start_date = .;
			  else 
			     price_start_date = price_end_date +1;
              price_end_date = price_date;
			  output;
		 end;
run;

Now that you have both dates, starting and ending, its easier to be used by sql:

proc sql;
     create table wand as
     select a.*, p.price
    from transactions a
    left join new_prices
    on a.product = b.product and
    tran_date between price_start_date and price_end_date.
run;
andreas_lds
Jade | Level 19

Just an idea:

  • sort price by product and descending price_date
  • use two temporary arrays, load price and price_date into those arrays per product
  • use a loop to find the first price_date <= tran_date
  • retain the index and use that for further processing
Ksharp
Super User
data Transactions;
input PRODUCT TRAN_DATE :date7.;
format TRAN_DATE date7.;
datalines;
1 05JUL19
1 07JUL19
1 01AUG19
2 04APR19
2 21MAY19
2 16JUN19
;

data Prices;
input PRODUCT PRICE_DATE :date7. PRICE;
format PRICE_DATE date7.;
datalines;
1 01JAN19 10
1 06JUL19 11
2 01JAN19 1000
2 01MAY19 1100
2 01AUG19 1200
;


proc sql;
     create table want as
   select a.*,b.price_date, b.price
    from transactions a
    left join prices b
    on a.product = b.product and tran_date >= price_date
	group by a.product, a.tran_date
	having price_date=max(price_date);
run;
andreas_lds
Jade | Level 19

I discarded some of the ideas i had earlier. First two steps are for setting the size of the temp arrays. Performance has to be tested.

 

proc freq data=work.prices noprint order=freq;
   table product / nopercent out=work.ProductCount(keep=product count) ;
run;

data _null_;
   set work.ProductCount(obs=1);
   call symputx('maxElements', count);
run;


data want(keep=product tran_date price_date price);
   if 0 then set work.Transactions;
   array d[&maxElements.] _temporary_;
   array p[&maxElements.] _temporary_;

   call missing(of d[*]);
   call missing(of p[*]);

   do i = 1 to dim(d) until (last.product);
      set work.Prices;
      by product;

      d[i] = price_date;
      p[i] = price;
   end;

   c = i;
   s = 1;

   do until(last.product);
      set work.Transactions;
      by product;

      do i = s to c while(d[i] <= tran_date);
      end;

      i = i-1;
      s = i;
      price_date = d[i];
      price = p[i];

      output;
   end;
run;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1668 views
  • 0 likes
  • 5 in conversation