Hi,
Is there any logic to get the previous month's value when a condition matches using two different data sets? Please see below data sets sample.
Dataset1:
Var_A | Var_B |
1 | 01Feb18 |
2 | 15Mar18 |
3 | 13Mar18 |
4 | 21Apr18 |
5 | 11Jul18 |
6 | 21Apr18 |
7 | 17May18 |
Dataset2:
Var_A | Var_C | Int Rate |
1 | 01Jan18 | 5.20% |
1 | 01Feb18 | 6.20% |
2 | 15Dec17 | 1.50% |
2 | 15Jan18 | 2.50% |
2 | 15Feb18 | 3.50% |
2 | 15Mar18 | 4.50% |
3 | 20Mar18 | 5.50% |
3 | 20Apr18 | 6.50% |
3 | 20May18 | 7.50% |
4 | 10Apr18 | 2.60% |
4 | 10May18 | 3.60% |
5 | 11Jun18 | 8.80% |
5 | 11Jul18 | 9.80% |
6 | 21Apr18 | 5.70% |
7 | 30Oct18 | 4.60% |
7 | 30Nov18 | 6.60% |
Desired Output:
Var_A | Var_B | Int Rate |
1 | 01Feb18 | 5.20% |
2 | 15Mar18 | 3.50% |
3 | 13Mar18 | |
4 | 21Apr18 | |
5 | 11Jul18 | 8.80% |
6 | 21Apr18 | 5.70% |
7 | 17May18 |
Why is the case for var_a=6 yielding int_rate=5.70% in you desired dataset? Both dataset1 and datasest2 have only one date (21apr18) for var_a=6. So it has no previous month, which by your specification means resulting int_rate should be missing, no?
Assuming my speculation is correct, then this code produces what you want (edited revision):
data dataset1;
input Var_A Var_B :date7.;
format var_b date9.;
datalines;
1 01Feb18
2 15Mar18
3 13Mar18
4 21Apr18
5 11Jul18
6 21Apr18
7 17May18
run;
data Dataset2;
input Var_A Var_C :date7. Int_Rate :percent5.2 ;
format var_c date9. int_rate percent7.2 ;
datalines;
1 01Jan18 5.20%
1 01Feb18 6.20%
2 15Dec17 1.50%
2 15Jan18 2.50%
2 15Feb18 3.50%
2 15Mar18 4.50%
3 20Mar18 5.50%
3 20Apr18 6.50%
3 20May18 7.50%
4 10Apr18 2.60%
4 10May18 3.60%
5 11Jun18 8.80%
5 11Jul18 9.80%
6 21Apr18 5.70%
7 30Oct18 4.60%
7 30Nov18 6.60%
run;
data want (keep=var_a var_b int_rate);
merge dataset1 (in=in1)
dataset2 (in=in2 keep=var_a var_c rename=(var_c=var_b));
by var_a var_b;
if in2 then do;
set dataset2 (keep=int_rate);
int_rate=lag(int_rate);
end;
if first.var_a then int_rate=.;
if in1;
run;
This code uses a conditional SET statement with a MERGE statement, which is a good technique for doing LOCF (last observation carried forward) for multiple datasets representing irregular time series.
Post test data in the form of a datastep. As such this is untested:
data want; set dataset1 (in=a) dataset2 (in=b rename=(intrate=irate)); ds=ifn(a,1,2); run; proc sort data=want; by var_a var_b; run; data want; if ds=1 then do; intrate=lag(irate); output; end; run;
Well it should resolve to if row comes from table aliased as a, then set to 1 else, set to 2. Maybe the ifn function doesn't like aliases,
data want; set dataset1 (in=a) dataset2 (in=b rename=(intrate=irate)); if a then ds=1;
else ds=2; run; proc sort data=want; by var_a var_b; run; data want; if ds=1 then do; intrate=lag(irate); output; end; run;
Not tested still.
Why is the case for var_a=6 yielding int_rate=5.70% in you desired dataset? Both dataset1 and datasest2 have only one date (21apr18) for var_a=6. So it has no previous month, which by your specification means resulting int_rate should be missing, no?
Assuming my speculation is correct, then this code produces what you want (edited revision):
data dataset1;
input Var_A Var_B :date7.;
format var_b date9.;
datalines;
1 01Feb18
2 15Mar18
3 13Mar18
4 21Apr18
5 11Jul18
6 21Apr18
7 17May18
run;
data Dataset2;
input Var_A Var_C :date7. Int_Rate :percent5.2 ;
format var_c date9. int_rate percent7.2 ;
datalines;
1 01Jan18 5.20%
1 01Feb18 6.20%
2 15Dec17 1.50%
2 15Jan18 2.50%
2 15Feb18 3.50%
2 15Mar18 4.50%
3 20Mar18 5.50%
3 20Apr18 6.50%
3 20May18 7.50%
4 10Apr18 2.60%
4 10May18 3.60%
5 11Jun18 8.80%
5 11Jul18 9.80%
6 21Apr18 5.70%
7 30Oct18 4.60%
7 30Nov18 6.60%
run;
data want (keep=var_a var_b int_rate);
merge dataset1 (in=in1)
dataset2 (in=in2 keep=var_a var_c rename=(var_c=var_b));
by var_a var_b;
if in2 then do;
set dataset2 (keep=int_rate);
int_rate=lag(int_rate);
end;
if first.var_a then int_rate=.;
if in1;
run;
This code uses a conditional SET statement with a MERGE statement, which is a good technique for doing LOCF (last observation carried forward) for multiple datasets representing irregular time series.
Here is a possible solution:
data data1;
input Var_A / Var_B date.;
format Var_B date9.;
cards;
1
01Feb18
2
15Mar18
3
13Mar18
4
21Apr18
5
11Jul18
6
21Apr18
7
17May18
;run;
Data data2;
input Var_A / Var_C date. / Int_Rate percent.;
format Var_C date9. Int_Rate percent7.2;
cards;
1
01Jan18
5.20%
1
01Feb18
6.20%
2
15Dec17
1.50%
2
15Jan18
2.50%
2
15Feb18
3.50%
2
15Mar18
4.50%
3
20Mar18
5.50%
3
20Apr18
6.50%
3
20May18
7.50%
4
10Apr18
2.60%
4
10May18
3.60%
5
11Jun18
8.80%
5
11Jul18
9.80%
6
21Apr18
5.70%
7
30Oct18
4.60%
7
30Nov18
6.60%
;run;
proc sql;
create index idx on data2(Var_A,Var_C);
quit;
data want;
set data1;
Var_C=intnx('month',Var_B,-1,'S');
set data2 key=idx/unique;
if _iorc_ then do;
Int_Rate=.;
_error_=0;
end;
drop Var_C;
run;
The program gives the same result as you indicated, except for Var_A=6 and Var_B=21APR18, where the data in the second table is not from the previous month, but from the same month (so Int_rate is set missing).
Although you got good solution here is another way using Left join with condition.
proc sql;
create table want as
select a.Var_A,a.Var_B,b.Int_Rate
from dataset1 a
left join dataset2 b
on (intnx('month',a.Var_B,0,'b')=intnx('month',b.Var_C,1,'b') and a.Var_A=b.Var_A)
;
quit;
data dataset1;
infile cards expandtabs;
input Var_A Var_B :date7.;
format var_b date9.;
datalines;
1 01Feb18
2 15Mar18
3 13Mar18
4 21Apr18
5 11Jul18
6 21Apr18
7 17May18
run;
data Dataset2;
infile cards expandtabs;
input Var_A Var_C :date7. Int_Rate :percent5.2 ;
format var_c date9. int_rate percent7.2 ;
datalines;
1 01Jan18 5.20%
1 01Feb18 6.20%
2 15Dec17 1.50%
2 15Jan18 2.50%
2 15Feb18 3.50%
2 15Mar18 4.50%
3 20Mar18 5.50%
3 20Apr18 6.50%
3 20May18 7.50%
4 10Apr18 2.60%
4 10May18 3.60%
5 11Jun18 8.80%
5 11Jul18 9.80%
6 21Apr18 5.70%
7 30Oct18 4.60%
7 30Nov18 6.60%
run;
data temp;
set dataset1;
var_c=intnx('month',Var_B,-1,'s');
run;
data want;
merge temp(in=ina) dataset2;
by Var_A Var_C;
if ina;
run;
proc print noobs;run;
Thank you all for your inputs.
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.