Dear all,
I am using sas 9.4 and trying to merge a M&A dataset (DATASET B) with a 60,000 observation with a compustat dataset (DATASET A) with over 800,000 observation. The B dataset has 2 unique identifiers. for the US firm it has CUSIP and for non-us firm it has sedol, and year that I can use for merge. on the other hand, the A dataset has both the identifiers and the year. I was wondering if there is any way that I can merge the two datasets (in a way that dataset A is a left join that keeps all the variables) based on identifier 1 OR identifier 2. meaning use identifier 1 for merge if the B has identifier 1 OR use identifier 2 if the B has identifier 1.
This is what I originally used and It absolutely did not work since half of the data has cusip and the other has sedol. Perhaps the data step be a better choice for these type of merging.
proc sql;
create table mna5 as select unique *
from comp as a left join mna as b
on a.cusip=b.cusip and a.sedol=b.sedol and a.year=b.year;
quit;
Thanks
A couple of short examples of the data that demonstrate the problem would be helpful. A data step to create an example using the variables of interest and the cases that you have and the final desired result would be optimal.
I would be tempted to create a set for matches on cuspid, another from matches on sedol, combine the results and remove duplicates as a crude easy to understand approach.
Then look for the ones that had no matches.
Would this work?
on (a.cusip=b.cusip or a.sedol=b.sedol) and a.year=b.year;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.