Merging two datasets by permno and date

Reply
New Contributor
Posts: 2

Merging two datasets by permno and date

Hi!

 

I am trying to merge a dataset which includes CRSP + IBES data with another IPO data set. The first datasets includes e.g. permno, ncusip, date, stock returns. However it has 50million rows since it includes daily stock returns. The other dataset includes only 7000 IPOs including PERMNO, NCUSIP, dates etc). I would like to merge these two data sets so that in the end I have a set where there is only those original 7000 IPOs + correct stock return from the first day of the IPO. What kind of code should I use to merge these two sets by the date and some identifier (e.g. permno)?

 

BR,

Voutsu

Super User
Posts: 8,590

Re: Merging two datasets by permno and date

[ Edited ]

Please supply example data. Use the macro from https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... to convert subsets of data to datasteps, and post them here according to https://communities.sas.com/t5/help/faqpage/faq-category-id/posting?nobounce. Also supply an example of the expected result. If you already have code that you tried, include that also.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
New Contributor
Posts: 2

Re: Merging two datasets by permno and date

Posted in reply to KurtBremser

Alright, thanks!

 

I am not using University Edition so is there also instructions for 9.4 version?

Super User
Posts: 8,590

Re: Merging two datasets by permno and date

The macro is included in the download as a .sas file. Store that anywhere on your PC, and either open it with EG or the enhanced editor of Base SAS, or with a text editor and copy/paste the text to SAS Studio. Once you run the code, the macro is ready for use. The macro code itself contains a description of it's use in a section that is also written to the log when the macro is used with !HELP as a parameter.

If you want to use the macro repeatedly in the future, you can include it in your autocall library.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super User
Posts: 12,148

Re: Merging two datasets by permno and date

Without data set names or variables it is very hard to be very specific. What it sounds like is you might be looking for some sort of Proc SQL left join

here is a skeleton

proc sql;
   create table want as
   select a.permno, a.ncusip, a.dateofselection, <repeat for other variables you want 
      from the set aliased as a>, b.var1, b.var2, <repeat for other variables wanted>
   from smallerdataset as a
        left join
        otherdataset  as b
        on a.permno=b.permno
        and a.ncusip=b.ncusip
        and a.dateofselection=b.dateofselection
        <repeat for other matching criteria>
   ;
quit;

you would want enough variables in the ON matches to reduce your matches from the larger dataset to a single record. If the variables have different names but the same values then use the matching variables. Note that the above comparisons assume that the case and spelling is the same (IBM will not match I.B.M. or GENERAL ELECTRIC will not match General Electric).

 

Trusted Advisor
Posts: 1,149

Re: Merging two datasets by permno and date

Since you've merged CRSP and IBES data, I bet your dataset (call it CRSP_IBES) is already sorted by permno/date.  If so, this is almost surely the fastest way to extract CRSP_IBES data for just the permno/dates of interest (from dataset we'll call IPODATA):

 

proc sort data=ipodata out=temp;
  by permno date;
run;

data want;
  merge crsp_ibes  temp (in=int);
  by permno date;
  if int;
run;

 

Question:  Since you are using public trade-exchange data as reported by CRSP, presumably the earliest record in CRSP_IBES for each PERMNO is an IPO date (or it's a date for new permnos from companies merging, splitting, or issuing new share classes).  So I would assume that the permno/dates in IPO_DATA is a proper subset of the collection of earliest records for each permno in CRSP_IBES.

 

If for any permno, you find that is not the case, (i.e. crsp_ibes has dates preceding the ipo date), you should investigate those permno's.  This assumes, of course, that the date in IPO_DATA is the IPO date related to the company issuing the stock identified by PERMNO.

Ask a Question
Discussion stats
  • 5 replies
  • 116 views
  • 0 likes
  • 4 in conversation