Hi,
Suppose I have two tables:
Transactions:
PRODUCT TRAN_DATE
1 05JUL19
1 07JUL19
1 01AUG19
2 04APR19
2 21MAY19
2 16JUN19
Prices:
PRODUCT PRICE_DATE PRICE
1 01JAN19 10
1 06JUL19 11
2 01JAN19 1000
2 01MAY19 1100
2 01AUG19 1200
I want the price in effect as of the transaction date. So the price that corresponds to the maximum price_date that is less than or equal to the tran_date. What is the correct SQL to return a table like below?
PRODUCT TRAN_DATE PRICE_DATE PRICE
1 05JUL19 01JAN19 10
1 07JUL19 06JUL19 11
1 01AUG19 06JUL19 11
2 04APR19 01JAN19 1000
2 21MAY19 01MAY19 1100
2 16JUN19 01MAY19 1100
Thanks,
Aaron
Does it have to be an SQL solution?
Still brute force but probably faster
data Transactions;
input PRODUCT TRAN_DATE :date7.;
format TRAN_DATE date7.;
datalines;
1 05JUL19
1 07JUL19
1 01AUG19
2 04APR19
2 21MAY19
2 16JUN19
;
data Prices;
input PRODUCT PRICE_DATE :date7. PRICE;
format PRICE_DATE date7.;
datalines;
1 01JAN19 10
1 06JUL19 11
2 01JAN19 1000
2 01MAY19 1100
2 01AUG19 1200
;
data temp;
set Prices;
by PRODUCT;
if not (last.PRODUCT) then do;
p=_N_+1;
set Prices(keep=PRICE_DATE rename=(PRICE_DATE=_PRICE_DATE)) point=p;
end;
else _PRICE_DATE=today();
run;
proc sql;
create table want as
select t.*, PRICE_DATE, PRICE
from transactions t, temp p
where t.PRODUCT=p.PRODUCT & PRICE_DATE le TRAN_DATE le _PRICE_DATE;
quit;
Create a new price table from its origin having: product ID, price_start_date, price_end_date, price.
Use next code to create it:
proc sort data=prices out=temp_pr;
by product descending price_date;
run;
data new_prices;
set temp_pr;
by product;
retain price_start_date price_end_date;
drop price_date;
if first.product then do; /* current price */
price_start_date = price_date;
price_end_date = '31dec2999'd; /* any high date */
output;
end; else do;
if last.product then
price_start_date = .;
else
price_start_date = price_end_date +1;
price_end_date = price_date;
output;
end;
run;
Now that you have both dates, starting and ending, its easier to be used by sql:
proc sql;
create table wand as
select a.*, p.price
from transactions a
left join new_prices
on a.product = b.product and
tran_date between price_start_date and price_end_date.
run;
Just an idea:
data Transactions;
input PRODUCT TRAN_DATE :date7.;
format TRAN_DATE date7.;
datalines;
1 05JUL19
1 07JUL19
1 01AUG19
2 04APR19
2 21MAY19
2 16JUN19
;
data Prices;
input PRODUCT PRICE_DATE :date7. PRICE;
format PRICE_DATE date7.;
datalines;
1 01JAN19 10
1 06JUL19 11
2 01JAN19 1000
2 01MAY19 1100
2 01AUG19 1200
;
proc sql;
create table want as
select a.*,b.price_date, b.price
from transactions a
left join prices b
on a.product = b.product and tran_date >= price_date
group by a.product, a.tran_date
having price_date=max(price_date);
run;
I discarded some of the ideas i had earlier. First two steps are for setting the size of the temp arrays. Performance has to be tested.
proc freq data=work.prices noprint order=freq;
table product / nopercent out=work.ProductCount(keep=product count) ;
run;
data _null_;
set work.ProductCount(obs=1);
call symputx('maxElements', count);
run;
data want(keep=product tran_date price_date price);
if 0 then set work.Transactions;
array d[&maxElements.] _temporary_;
array p[&maxElements.] _temporary_;
call missing(of d[*]);
call missing(of p[*]);
do i = 1 to dim(d) until (last.product);
set work.Prices;
by product;
d[i] = price_date;
p[i] = price;
end;
c = i;
s = 1;
do until(last.product);
set work.Transactions;
by product;
do i = s to c while(d[i] <= tran_date);
end;
i = i-1;
s = i;
price_date = d[i];
price = p[i];
output;
end;
run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.