Hi,
Is there any logic to get the previous month's value when a condition matches using two different data sets? Please see below data sets sample.
Dataset1:
Var_A | Var_B |
1 | 01Feb18 |
2 | 15Mar18 |
3 | 13Mar18 |
4 | 21Apr18 |
5 | 11Jul18 |
6 | 21Apr18 |
7 | 17May18 |
Dataset2:
Var_A | Var_C | Int Rate |
1 | 01Jan18 | 5.20% |
1 | 01Feb18 | 6.20% |
2 | 15Dec17 | 1.50% |
2 | 15Jan18 | 2.50% |
2 | 15Feb18 | 3.50% |
2 | 15Mar18 | 4.50% |
3 | 20Mar18 | 5.50% |
3 | 20Apr18 | 6.50% |
3 | 20May18 | 7.50% |
4 | 10Apr18 | 2.60% |
4 | 10May18 | 3.60% |
5 | 11Jun18 | 8.80% |
5 | 11Jul18 | 9.80% |
6 | 21Apr18 | 5.70% |
7 | 30Oct18 | 4.60% |
7 | 30Nov18 | 6.60% |
Desired Output:
Var_A | Var_B | Int Rate |
1 | 01Feb18 | 5.20% |
2 | 15Mar18 | 3.50% |
3 | 13Mar18 | |
4 | 21Apr18 | |
5 | 11Jul18 | 8.80% |
6 | 21Apr18 | 5.70% |
7 | 17May18 |
Why is the case for var_a=6 yielding int_rate=5.70% in you desired dataset? Both dataset1 and datasest2 have only one date (21apr18) for var_a=6. So it has no previous month, which by your specification means resulting int_rate should be missing, no?
Assuming my speculation is correct, then this code produces what you want (edited revision):
data dataset1;
input Var_A Var_B :date7.;
format var_b date9.;
datalines;
1 01Feb18
2 15Mar18
3 13Mar18
4 21Apr18
5 11Jul18
6 21Apr18
7 17May18
run;
data Dataset2;
input Var_A Var_C :date7. Int_Rate :percent5.2 ;
format var_c date9. int_rate percent7.2 ;
datalines;
1 01Jan18 5.20%
1 01Feb18 6.20%
2 15Dec17 1.50%
2 15Jan18 2.50%
2 15Feb18 3.50%
2 15Mar18 4.50%
3 20Mar18 5.50%
3 20Apr18 6.50%
3 20May18 7.50%
4 10Apr18 2.60%
4 10May18 3.60%
5 11Jun18 8.80%
5 11Jul18 9.80%
6 21Apr18 5.70%
7 30Oct18 4.60%
7 30Nov18 6.60%
run;
data want (keep=var_a var_b int_rate);
merge dataset1 (in=in1)
dataset2 (in=in2 keep=var_a var_c rename=(var_c=var_b));
by var_a var_b;
if in2 then do;
set dataset2 (keep=int_rate);
int_rate=lag(int_rate);
end;
if first.var_a then int_rate=.;
if in1;
run;
This code uses a conditional SET statement with a MERGE statement, which is a good technique for doing LOCF (last observation carried forward) for multiple datasets representing irregular time series.
Post test data in the form of a datastep. As such this is untested:
data want; set dataset1 (in=a) dataset2 (in=b rename=(intrate=irate)); ds=ifn(a,1,2); run; proc sort data=want; by var_a var_b; run; data want; if ds=1 then do; intrate=lag(irate); output; end; run;
Well it should resolve to if row comes from table aliased as a, then set to 1 else, set to 2. Maybe the ifn function doesn't like aliases,
data want; set dataset1 (in=a) dataset2 (in=b rename=(intrate=irate)); if a then ds=1;
else ds=2; run; proc sort data=want; by var_a var_b; run; data want; if ds=1 then do; intrate=lag(irate); output; end; run;
Not tested still.
Why is the case for var_a=6 yielding int_rate=5.70% in you desired dataset? Both dataset1 and datasest2 have only one date (21apr18) for var_a=6. So it has no previous month, which by your specification means resulting int_rate should be missing, no?
Assuming my speculation is correct, then this code produces what you want (edited revision):
data dataset1;
input Var_A Var_B :date7.;
format var_b date9.;
datalines;
1 01Feb18
2 15Mar18
3 13Mar18
4 21Apr18
5 11Jul18
6 21Apr18
7 17May18
run;
data Dataset2;
input Var_A Var_C :date7. Int_Rate :percent5.2 ;
format var_c date9. int_rate percent7.2 ;
datalines;
1 01Jan18 5.20%
1 01Feb18 6.20%
2 15Dec17 1.50%
2 15Jan18 2.50%
2 15Feb18 3.50%
2 15Mar18 4.50%
3 20Mar18 5.50%
3 20Apr18 6.50%
3 20May18 7.50%
4 10Apr18 2.60%
4 10May18 3.60%
5 11Jun18 8.80%
5 11Jul18 9.80%
6 21Apr18 5.70%
7 30Oct18 4.60%
7 30Nov18 6.60%
run;
data want (keep=var_a var_b int_rate);
merge dataset1 (in=in1)
dataset2 (in=in2 keep=var_a var_c rename=(var_c=var_b));
by var_a var_b;
if in2 then do;
set dataset2 (keep=int_rate);
int_rate=lag(int_rate);
end;
if first.var_a then int_rate=.;
if in1;
run;
This code uses a conditional SET statement with a MERGE statement, which is a good technique for doing LOCF (last observation carried forward) for multiple datasets representing irregular time series.
Here is a possible solution:
data data1;
input Var_A / Var_B date.;
format Var_B date9.;
cards;
1
01Feb18
2
15Mar18
3
13Mar18
4
21Apr18
5
11Jul18
6
21Apr18
7
17May18
;run;
Data data2;
input Var_A / Var_C date. / Int_Rate percent.;
format Var_C date9. Int_Rate percent7.2;
cards;
1
01Jan18
5.20%
1
01Feb18
6.20%
2
15Dec17
1.50%
2
15Jan18
2.50%
2
15Feb18
3.50%
2
15Mar18
4.50%
3
20Mar18
5.50%
3
20Apr18
6.50%
3
20May18
7.50%
4
10Apr18
2.60%
4
10May18
3.60%
5
11Jun18
8.80%
5
11Jul18
9.80%
6
21Apr18
5.70%
7
30Oct18
4.60%
7
30Nov18
6.60%
;run;
proc sql;
create index idx on data2(Var_A,Var_C);
quit;
data want;
set data1;
Var_C=intnx('month',Var_B,-1,'S');
set data2 key=idx/unique;
if _iorc_ then do;
Int_Rate=.;
_error_=0;
end;
drop Var_C;
run;
The program gives the same result as you indicated, except for Var_A=6 and Var_B=21APR18, where the data in the second table is not from the previous month, but from the same month (so Int_rate is set missing).
Although you got good solution here is another way using Left join with condition.
proc sql;
create table want as
select a.Var_A,a.Var_B,b.Int_Rate
from dataset1 a
left join dataset2 b
on (intnx('month',a.Var_B,0,'b')=intnx('month',b.Var_C,1,'b') and a.Var_A=b.Var_A)
;
quit;
data dataset1;
infile cards expandtabs;
input Var_A Var_B :date7.;
format var_b date9.;
datalines;
1 01Feb18
2 15Mar18
3 13Mar18
4 21Apr18
5 11Jul18
6 21Apr18
7 17May18
run;
data Dataset2;
infile cards expandtabs;
input Var_A Var_C :date7. Int_Rate :percent5.2 ;
format var_c date9. int_rate percent7.2 ;
datalines;
1 01Jan18 5.20%
1 01Feb18 6.20%
2 15Dec17 1.50%
2 15Jan18 2.50%
2 15Feb18 3.50%
2 15Mar18 4.50%
3 20Mar18 5.50%
3 20Apr18 6.50%
3 20May18 7.50%
4 10Apr18 2.60%
4 10May18 3.60%
5 11Jun18 8.80%
5 11Jul18 9.80%
6 21Apr18 5.70%
7 30Oct18 4.60%
7 30Nov18 6.60%
run;
data temp;
set dataset1;
var_c=intnx('month',Var_B,-1,'s');
run;
data want;
merge temp(in=ina) dataset2;
by Var_A Var_C;
if ina;
run;
proc print noobs;run;
Thank you all for your inputs.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.