I am having a problem in merging two datasets--COMPUSTAT annual and CRSP Monthly Stock file.
Example:
COMPUSTAT DATA:
gvkey datadate yr indfmt consol popsrc datafmt tic cusip
001000 31dec1974 1974 INDL C D STD AE.2 000032102
001004 31may1975 1974 INDL C D STD AIR 000361105
001005 31oct1974 1974 INDL C D STD ABA.2 000370106
001006 30jun1974 1974 INDL C D STD 1145B 000736108
001007 30sep1974 1974 INDL C D STD 4135B 000774109
CRSP DATA:
gvkey linkprim linktype lpermno lpermco iid datadate tic cusip
001000 P LU 25881 23369 01 31mar1974 AE.2 000032102
001000 P LU 25881 23369 01 28feb1974 AE.2 000032102
001000 P LU 25881 23369 01 31jan1974 AE.2 000032102
001004 P LU 54594 20000 01 31mar1974 AIR 000361105
the CRSP datasets is monthly whereas COMPUSTAT is annual. CRSP also has a YEAR variable.
Thank you for your step by step reply.
data CRSP_new;
set CRSP;
yr = YEAR(date);
run;
proc sql;
create table Mereged_table as
select a.* , b.*
from COMPUSTAT a
inner join CRSP_new b
on a.gvkey = b.gvkey and a.yr= b.yr;
quit;
Hope I got your problem right.
Step1. in CRSP dataset, create new variable yr that would hold value of year.
Step2. Merge COMPUSTAT and CRSP_new on two columns gvkey and yr.
Above code it is not tested.
data CRSP_new;
set CRSP;
yr = YEAR(date);
run;
proc sql;
create table Mereged_table as
select a.* , b.*
from COMPUSTAT a
inner join CRSP_new b
on a.gvkey = b.gvkey and a.yr= b.yr;
quit;
Hope I got your problem right.
Step1. in CRSP dataset, create new variable yr that would hold value of year.
Step2. Merge COMPUSTAT and CRSP_new on two columns gvkey and yr.
Above code it is not tested.
It would be great if you list what should your output look like.
@RahulG wrote:
data CRSP_new; set CRSP; yr = YEAR(date); run; proc sql; create table Mereged_table as select a.* , b.* from COMPUSTAT a inner join CRSP_new b on a.gvkey = b.gvkey and a.yr= b.yr; quit;
Hope I got your problem right.
Step1. in CRSP dataset, create new variable yr that would hold value of year.
Step2. Merge COMPUSTAT and CRSP_new on two columns gvkey and yr.
Above code it is not tested.
@RahulG In this case it might be easier to do the calculation in the join condition?
a.yr=year(b.date)
Thank you. It worked.
Thank you.
How can you get GVKEY in CRSP ?
I believe these data are here, but you need to have access via your institution as well.
https://wrds-web.wharton.upenn.edu/wrds/
It is not allowed to distribute these databases. It will be a problem for Author if he/she is using some data which is not subscribed by their institutions. The most likely way is to collaborate (joint research) with someone from the institution who has access.
They have a trial access, did you try that approach? Not sure if that would let you take data out. And how can you site the source if you don't have access?
I think this is a problem you should discuss with your thesis supervisor.
@Darrel wrote:
Is there a way someone can help me get access?
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.
Early bird rate extended! Save $200 when you sign up by March 31.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.