I want to merge two tables using two variables.
first by portfolio level identifiers and by date variable.
datasets look like below
dataset "hex" :
data hex;
input crsp_portno report_dt permno;
datalines;
1000001 20080229 123
1000001 20080229 124
1000001 20080331 123
1000001 20080331 124
1000001 20080331 125
1000001 20080430 129
1000001 20080531 130
1000002 20080229 123
1000002 20080229 126
1000002 20080229 127
;
run;
dataset "portno"
data portno;
input crsp_fundno crsp_portno begdt enddt crsp_cl_grp
4273 1000001 20080229 20080331 2000866
4274 1000001 20080229 20080331 2000866
4275 1000001 20080229 20080331 2000866
4276 1000001 20080229 20080331 2000866
4277 1000001 20080430 20080531 2000971
;
run;
What I want to do is to merge crsp_cl_grp variable into dataset "hex"
First, dataset "portno" should be matched with dataset "hex" by crsp_portno from begdt (beginning date) to enddt (end date).
for example, crsp_portno=1000001 from 20080229 to 20080331 in dataset "hex"
should be matched with crsp_portno=1000001 with begdt=20080229 and enddt=20080331 with crsp_cl_grp=2000866
in this situation, how do I merge two datasets?
Since you have a many-to-many relationship, you will need proc sql:
data hex;
input crsp_portno report_dt :yymmdd8. permno;
format report_dt yymmddd10.;
datalines;
1000001 20080229 123
1000001 20080229 124
1000001 20080331 123
1000001 20080331 124
1000001 20080331 125
1000001 20080430 129
1000001 20080531 130
1000002 20080229 123
1000002 20080229 126
1000002 20080229 127
;
run;
data portno;
input crsp_fundno crsp_portno begdt :yymmdd8. enddt :yymmdd8. crsp_cl_grp;
format begdt enddt yymmddd10.;
datalines;
4273 1000001 20080229 20080331 2000866
4274 1000001 20080229 20080331 2000866
4275 1000001 20080229 20080331 2000866
4276 1000001 20080229 20080331 2000866
4277 1000001 20080430 20080531 2000971
;
run;
proc sql;
create table want as
select a.*, b.crsp_fundno, b.crsp_cl_grp
from hex a, portno b
where
a.crsp_portno = b.crsp_portno and b.begdt <= a.report_dt <= b.enddt
;
quit;
Note that I stored your dates as SAS date values, making future use much easier.
Since you have a many-to-many relationship, you will need proc sql:
data hex;
input crsp_portno report_dt :yymmdd8. permno;
format report_dt yymmddd10.;
datalines;
1000001 20080229 123
1000001 20080229 124
1000001 20080331 123
1000001 20080331 124
1000001 20080331 125
1000001 20080430 129
1000001 20080531 130
1000002 20080229 123
1000002 20080229 126
1000002 20080229 127
;
run;
data portno;
input crsp_fundno crsp_portno begdt :yymmdd8. enddt :yymmdd8. crsp_cl_grp;
format begdt enddt yymmddd10.;
datalines;
4273 1000001 20080229 20080331 2000866
4274 1000001 20080229 20080331 2000866
4275 1000001 20080229 20080331 2000866
4276 1000001 20080229 20080331 2000866
4277 1000001 20080430 20080531 2000971
;
run;
proc sql;
create table want as
select a.*, b.crsp_fundno, b.crsp_cl_grp
from hex a, portno b
where
a.crsp_portno = b.crsp_portno and b.begdt <= a.report_dt <= b.enddt
;
quit;
Note that I stored your dates as SAS date values, making future use much easier.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.