I have two base table.
table 1 | |||
Product | st_date | End_date | Price |
A | 01-Jan-18 | 06-Jan-18 | 10 |
B | 09-Feb-18 | 14-Feb-18 | 20 |
C | 06-May-18 | 11-May-18 | 30 |
D | 11-Jun-18 | 16-Jun-18 | 15 |
Table2 | ||
Product | Override date | Price |
A | 03-Jan-18 | 12 |
B | 09-Feb-18 | 18 |
D | 17-Jun-18 | 23 |
1st table is the price table for 4 products, it says price for that particular product for a given period.
The second table is overridden table where business decided to change the price between the given period from table1.
Output should be like this...
Output | |||
Product | st_date | End_date | Price |
A | 01-Jan-18 | 02-Jan-18 | 10 |
A | 03-Jan-18 | 06-Jan-18 | 12 |
B | 09-Feb-18 | 14-Feb-18 | 18 |
C | 06-May-18 | 11-May-18 | 30 |
D | 11-Jun-18 | 16-Jun-18 | 15 |
I am not considering to write this data first on day wise, then convert on range as this is huge table.
i just want to change the range(st_date & end_date based on table2)
data table1;
length
product $ 1
st_date end_date 8
price 8
;
informat st_date end_date date.;
format st_date end_date date9.;
input product st_date end_date price;
datalines;
A 01-Jan-18 06-Jan-18 10
B 09-Feb-18 14-Feb-18 20
C 06-May-18 11-May-18 30
D 11-Jun-18 16-Jun-18 15
;
run;
data table2;
length
product $ 1
override_date 8
price 8
;
informat override_date date.;
format override_date date9.;
input product override_date price;
datalines;
A 03-Jan-18 12
B 09-Feb-18 18
D 17-Jun-18 23
;
run;
data temp;
set table1;
n+1;
do date=st_date to end_date;
output;
end;
format date date9.;
drop st_date end_date;
run;
data temp1;
merge temp table2(in=inb rename=(price=_price override_date=date));
by product date;
in_b=inb;
run;
data temp2;
set temp1;
if n ne lag(n) or in_b then group+1;
run;
proc sql;
create table want as
select group,max(Product) as Product,
min(date) as st_date format=date9.,
max(date) as end_date format=date9.,
coalesce(max(_price),max(price)) as Price
from temp2
group by group;
quit;
How many entries for each product exist in table1?
How many overrides exist for each product in table2?
Can you post the data in usable form (data step using datalines)?
Answer for all the question is, it can be any number. This is smallest data which can help to understand the scenario.
This is dummy data.
The following steps is build upon some assumptions:
data table1;
length
product $ 1
st_date end_date 8
price 8
;
informat st_date end_date date.;
format st_date end_date date9.;
input product st_date end_date price;
datalines;
A 01-Jan-18 06-Jan-18 10
B 09-Feb-18 14-Feb-18 20
C 06-May-18 11-May-18 30
D 11-Jun-18 16-Jun-18 15
;
run;
data table2;
length
product $ 1
override_date 8
price 8
;
informat override_date date.;
format override_date date9.;
input product override_date price;
datalines;
A 03-Jan-18 12
B 09-Feb-18 18
D 17-Jun-18 23
;
run;
data work.updated;
merge work.table1 work.table2(rename=(price = new_price) in= needsUpdate);
by product;
if needsUpdate and st_date <= override_date <= end_date then do;
if st_date < override_date < end_date then do;
second_end = end_date;
end_date = override_date - 1;
output;
st_date = override_date;
end_date = second_end;
price = new_price;
output;
end;
else do;
if override_date = st_date then do;
st_date = override_date;
price = new_price;
output;
end;
end;
end;
else do;
output;
end;
drop new_price second_end override_date;
run;
Sorry, I think I oversimplified the example.
1) None of the record is fixed it can be any number.
2) 2 assumption is fine, I can manage it.
3)updated when override date>=st_date and override date<=end_date
What happens if your table 2 has multiple observations per product? For example, what id table 2 looks like this?
data table2;
input Product $ Override_date:date11. Price;
format Override_date date9.;
datalines;
A 03-Jan-18 12
A 04-Jan-18 13
B 09-Feb-18 18
B 11-Feb-18 30
D 17-Jun-18 23
;
there can be multiple override date between start_date and end_date, and if there is more then one override then .
table1 | |||
product | st_date | end_date | price |
A | 10/01/2018 | 15/01/2018 | 10 |
Table2 | ||
Product | Override | Price |
A | 10/01/2018 | 11 |
A | 11/01/2018 | 12 |
A | 13/01/2018 | 9 |
A | 14/01/2018 | 5 |
output will be like | |||
product | st_date | end_date | price |
A | 10/01/2018 | 10/01/2018 | 11 |
A | 11/01/2018 | 12/01/2018 | 12 |
A | 13/01/2018 | 13/01/2018 | 9 |
A | 14/01/2018 | 15/01/2018 | 5 |
Ok. And what about table1? Can multiple observations per product be present?
yes,everything can be multiple. for a same product it can be different date range also. the level of uniqueness is Product+start_date+end_date_price only.
Ok. There is quite a bit of logic to consider. See if this works for you. It returns the correct data for both of your posted pairs of example data sets table1 and table2
data table1;
input Product $ (st_date End_date)(:date11.) Price;
format st_date End_date date9.;
datalines;
A 10-Jan-18 15-Jan-18 10
;
data table2;
input Product $ Override_date:date11. Price;
format Override_date date9.;
datalines;
A 10-Jan-18 11
A 11-Jan-18 12
A 13-Jan-18 9
A 14-Jan-18 5
;
data want;
if 0 then set table2(rename=(Price=Newprice));
if _N_ = 1 then do;
declare hash h(dataset:'table2(rename=(Price=Newprice))', multidata:'Y');
h.defineKey('Product');
h.defineData(all:'Y');
h.defineDone();
end;
call missing(Override_date, Newprice);
set table1;
rc=h.find();
_tempOverride_date=Override_date;
_temp_st_date=st_date;
_tempEnd_date=End_date;
_tempprice=Newprice;
if rc ne 0 then output;
do c=1 by 1 while (rc=0);
h.has_next(result: r);
if r ne 0 then do;
Price=Newprice;
rc=h.find_next();
End_date=Override_date-1;
output;
st_date=End_date+1;
end;
if r=0 & c>1 then do;
st_date=End_date+1;
End_date=_tempEnd_date;
Price=Newprice;
output;
rc=h.find_next();
end;
if r=0 & c=1 then do;
if Override_date=st_date | Override_date=End_date then do;
st_date=_temp_st_date;
End_date=_tempEnd_date;
Price=_tempprice;
output;
end;
else if st_date < Override_date < End_date then do;
st_date=_temp_st_date;
End_date=Override_date-1;
output;
st_date=End_date+1;
End_date=_tempEnd_date;
Price=Newprice;
output;
end;
else output;
rc=h.find_next();
end;
end;
keep product st_date end_date price;
run;
proc print data=want;
run;
.. Also, for the first data sets posted, it returns
sorry, I am facing problem when I have more date for same product.
table 1 | |||
Product | st_date | End_date | Price |
A | 10-Jan-18 | 15-Jan-18 | 10 |
A | 15-Jan-18 | 30-Jan-18 | 10 |
B | 09-Feb-18 | 14-Feb-18 | 20 |
C | 06-May-18 | 11-May-18 | 30 |
D | 11-Jun-18 | 16-Jun-18 | 15 |
table 2
Product | Override_date | Price |
A | 10-Jan-18 | 11 |
A | 11-Jan-18 | 12 |
A | 13-Jan-18 | 9 |
A | 14-Jan-18 | 5 |
A | 03-Jan-18 | 12 |
B | 09-Feb-18 | 18 |
D | 17-Jun-18 | 23
|
output is coming like
Product | st_date | End_date | Price |
A | 10-Jan-18 | 10-Jan-18 | 11 |
A | 11-Jan-18 | 12-Jan-18 | 12 |
A | 13-Jan-18 | 13-Jan-18 | 9 |
A | 14-Jan-18 | 02-Jan-18 | 5 |
A | 03-Jan-18 | 15-Jan-18 | 12 |
A | 15-Jan-18 | 10-Jan-18 | 11 |
A | 11-Jan-18 | 12-Jan-18 | 12 |
A | 13-Jan-18 | 13-Jan-18 | 9 |
A | 14-Jan-18 | 02-Jan-18 | 5 |
A | 03-Jan-18 | 30-Jan-18 | 12 |
B | 09-Feb-18 | 14-Feb-18 | 18 |
C | 06-May-18 | 11-May-18 | 30 |
D | 11-Jun-18 | 16-Jun-18 | 15 |
highlighted is error.
Hi @Srigyan
I have a macro to handle problems like this. It is an interval merger that takes several intervals for the same ID in both datasets and accepts missing periods between intervals. It works with subgroups too. It is made to handle large data sets, so it is quite effecient.
It requires intervals in both data sets, and besides interval- and ID variables it cannot accept variables withe same name in both data sets, so something must be done to add end dates to the override table and chane the price variable name before calling the macro. And the documentation is written in danish. It is big, so I have attached it as a file. You are free to use it, and I will be happy to help with any questions.
It is used like this:
data work.a;
informat St_date End_date anydtdte9.;
input Product $ St_date End_date Price;
datalines;
A 17/12/2017 09/01/2018 8
A 10/01/2018 15/01/2018 10
;
run;
data work.b;
informat Override_date End_date anydtdte9.;
input Product $ Override_date End_date new_price;
datalines;
A 10/01/2018 10/01/2018 11
A 11/01/2018 12/01/2018 12
A 13/01/2018 13/01/2018 9
A 14/01/2018 31/12/9999 5
;
run;
%FletrensInterval(work.a, work.b, work.want, unita=Product, datefirsta=St_date,
datelasta=End_date, unitb=Product, datefirstb=Override_date, datelastb=End_date, join=left);
Hi @Srigyan
Here is a bit of code to transform the override dataset.
data work.b;
informat Override_date anydtdte9.;
input Product $ Override_date price;
datalines;
A 10/01/2018 11
A 11/01/2018 12
A 13/01/2018 9
A 14/01/2018 5
;
run;
data b (rename=(price=New_price));
format Override_date End_date ddmmyyd10.;
set b; by Product;
p = _N_+1;
if not last.Product then do;
set b(drop=price rename=(Override_date=end_date)) point=p;
end_date = end_date - 1;
end;
else end_date = '31dec9999'd;
run;
I am working on your example. I will let you know if this works for me.
As suggested i put end date for over_ride
table1 | |||
Product | st_date | End_date | Price |
A | 01-Jan-18 | 15-Jan-18 | 10 |
B | 09-Feb-18 | 14-Feb-18 | 20 |
C | 06-May-18 | 11-May-18 | 30 |
D | 11-Jun-18 | 16-Jun-18 | 15 |
table2 | |||
Product | Override_date | End_date | Price |
A | 03-Jan-18 | 15-Jan-18 | 12 |
A | 10-Jan-18 | 15-Jan-18 | 11 |
A | 11-Jan-18 | 15-Jan-18 | 12 |
A | 13-Jan-18 | 15-Jan-18 | 9 |
A | 14-Jan-18 | 15-Jan-18 | 5 |
B | 09-Feb-18 | 14-Feb-18 | 18 |
D | 17-Jun-18 | 16-Jun-18 | 23 |
i run the macro
%FletrensInterval(work.table1, work.table2, work.want, unita=Product, datefirsta=St_date,
datelasta=End_date, unitb=Product, datefirstb=Override_date, datelastb=End_date, join=left);
showing following error:
24 proc datasets lib=&lib nolist; delete &ds; quit;
24 ! %rensinterval(&tablea,qzbtableft0,datefirst=&datefirsta,datelast=&datelasta);
_
180
24 ! %rensinterval(&tableb,qzbtabright0,datefirst=&datefirstb,datelast=&datelastb); * sortering afhængig om byvars
ERROR 180-322: Statement is not valid or it is used out of proper order.
WARNING: Apparent invocation of macro RENSINTERVAL not resolved.
WARNING: Apparent invocation of macro RENSINTERVAL not resolved.
24 proc datasets lib=&lib nolist; delete &ds; quit;
24 ! %rensinterval(&tablea,qzbtableft0,datefirst=&datefirsta,datelast=&datelasta);
24 ! %rensinterval(&tableb,qzbtabright0,datefirst=&datefirstb,datelast=&datelastb); * sortering afhængig om byvars
_
180
ERROR 180-322: Statement is not valid or it is used out of proper order.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.