I want to merge two tables using two variables.
first by portfolio level identifiers and by date variable.
datasets look like below
dataset "hex" :
data hex;
input crsp_portno report_dt permno;
datalines;
1000001 20080229 123
1000001 20080229 124
1000001 20080331 123
1000001 20080331 124
1000001 20080331 125
1000001 20080430 129
1000001 20080531 130
1000002 20080229 123
1000002 20080229 126
1000002 20080229 127
;
run;
dataset "portno"
data portno;
input crsp_fundno crsp_portno begdt enddt crsp_cl_grp
4273 1000001 20080229 20080331 2000866
4274 1000001 20080229 20080331 2000866
4275 1000001 20080229 20080331 2000866
4276 1000001 20080229 20080331 2000866
4277 1000001 20080430 20080531 2000971
;
run;
What I want to do is to merge crsp_cl_grp variable into dataset "hex"
First, dataset "portno" should be matched with dataset "hex" by crsp_portno from begdt (beginning date) to enddt (end date).
for example, crsp_portno=1000001 from 20080229 to 20080331 in dataset "hex"
should be matched with crsp_portno=1000001 with begdt=20080229 and enddt=20080331 with crsp_cl_grp=2000866
in this situation, how do I merge two datasets?
Since you have a many-to-many relationship, you will need proc sql:
data hex;
input crsp_portno report_dt :yymmdd8. permno;
format report_dt yymmddd10.;
datalines;
1000001 20080229 123
1000001 20080229 124
1000001 20080331 123
1000001 20080331 124
1000001 20080331 125
1000001 20080430 129
1000001 20080531 130
1000002 20080229 123
1000002 20080229 126
1000002 20080229 127
;
run;
data portno;
input crsp_fundno crsp_portno begdt :yymmdd8. enddt :yymmdd8. crsp_cl_grp;
format begdt enddt yymmddd10.;
datalines;
4273 1000001 20080229 20080331 2000866
4274 1000001 20080229 20080331 2000866
4275 1000001 20080229 20080331 2000866
4276 1000001 20080229 20080331 2000866
4277 1000001 20080430 20080531 2000971
;
run;
proc sql;
create table want as
select a.*, b.crsp_fundno, b.crsp_cl_grp
from hex a, portno b
where
a.crsp_portno = b.crsp_portno and b.begdt <= a.report_dt <= b.enddt
;
quit;
Note that I stored your dates as SAS date values, making future use much easier.
Since you have a many-to-many relationship, you will need proc sql:
data hex;
input crsp_portno report_dt :yymmdd8. permno;
format report_dt yymmddd10.;
datalines;
1000001 20080229 123
1000001 20080229 124
1000001 20080331 123
1000001 20080331 124
1000001 20080331 125
1000001 20080430 129
1000001 20080531 130
1000002 20080229 123
1000002 20080229 126
1000002 20080229 127
;
run;
data portno;
input crsp_fundno crsp_portno begdt :yymmdd8. enddt :yymmdd8. crsp_cl_grp;
format begdt enddt yymmddd10.;
datalines;
4273 1000001 20080229 20080331 2000866
4274 1000001 20080229 20080331 2000866
4275 1000001 20080229 20080331 2000866
4276 1000001 20080229 20080331 2000866
4277 1000001 20080430 20080531 2000971
;
run;
proc sql;
create table want as
select a.*, b.crsp_fundno, b.crsp_cl_grp
from hex a, portno b
where
a.crsp_portno = b.crsp_portno and b.begdt <= a.report_dt <= b.enddt
;
quit;
Note that I stored your dates as SAS date values, making future use much easier.
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.