BookmarkSubscribeRSS Feed
mauimaggie
Calcite | Level 5

Desire to match merge two data sets.  The wrinkle is that  If there is no match using a primary match variable, I'd like to also try to match with a secondary match variable.  There is probably a simple way to do this but I haven't found it.  Here's an example:

 

Data A;

TICKER    Price

A             10

AA           11

S             21

B             50

 

Data B;

TICKER  SECOND_TICKER  OTHER_STUFF;

A                 A             Consumer

AA              ZZ            REIT

KMT          S              Consumer

B               BMT         Industrial

;

 

Want:  A match merge of A and B using ticker, but if no match on ticker, then try finding one by using "second_ticker" as the match variable.  This would be the ideal output:

 

Data C:

TICKER   SECOND_TICKER  PRICE  OTHER_STUFF

A                A                      10            Consumer

AA             zz                      11            REIT

S                S                      21           Consumer

B             BMT                    50            Industrial

 

So, there is no match between datasets A and B  for the observation with ticker=S when using ticker as the primary match variable.   However, there is a match for the no-match observation (ticker= S) between datasets A and B if we try again searching for matches using second_ticker in Dataset B and ticker in A

 

I can accomplish this in a messy way by defining temporary match variables and doing multiple passes through the data  - eventually combining partial results.  But, Is there an elegant solution, or am I missing something obvious (SAS v9.4)?

2 REPLIES 2
Reeza
Super User
Do both the joins using SQL and use COALESCE() to take the one that matches.

Untested, but something like this:

proc sql;
create table want as
select a.*, coalesce(t2.other_stuff, t3.other_stuff) as other_stuff
from have as a
left join secondTable as t2
on a.ticker=t2.ticker
left join secondTable as t3
on a.secondTicker=t3.ticker;
quit;
Astounding
PROC Star

An elegant, relatively fast approach would use two hash tables.  In one DATA step:

 

  1. Create a hash table from B that uses TICKER as the key and OTHER_STUFF as the data.
  2. Create another hash table from B that uses SECOND_TICKER as the key and OTHER_STUFF as the data.
  3. Read in an observation from data set A, and compare to the hash tables.  More specifically, see if there's a match in hash table #1.  If so, read in OTHER_STUFF.  If not, see if there is a match in hash table #2.  If so, read in OTHER_STUFF.

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 873 views
  • 0 likes
  • 3 in conversation