BookmarkSubscribeRSS Feed
mauimaggie
Calcite | Level 5

Desire to match merge two data sets.  The wrinkle is that  If there is no match using a primary match variable, I'd like to also try to match with a secondary match variable.  There is probably a simple way to do this but I haven't found it.  Here's an example:

 

Data A;

TICKER    Price

A             10

AA           11

S             21

B             50

 

Data B;

TICKER  SECOND_TICKER  OTHER_STUFF;

A                 A             Consumer

AA              ZZ            REIT

KMT          S              Consumer

B               BMT         Industrial

;

 

Want:  A match merge of A and B using ticker, but if no match on ticker, then try finding one by using "second_ticker" as the match variable.  This would be the ideal output:

 

Data C:

TICKER   SECOND_TICKER  PRICE  OTHER_STUFF

A                A                      10            Consumer

AA             zz                      11            REIT

S                S                      21           Consumer

B             BMT                    50            Industrial

 

So, there is no match between datasets A and B  for the observation with ticker=S when using ticker as the primary match variable.   However, there is a match for the no-match observation (ticker= S) between datasets A and B if we try again searching for matches using second_ticker in Dataset B and ticker in A

 

I can accomplish this in a messy way by defining temporary match variables and doing multiple passes through the data  - eventually combining partial results.  But, Is there an elegant solution, or am I missing something obvious (SAS v9.4)?

2 REPLIES 2
Reeza
Super User
Do both the joins using SQL and use COALESCE() to take the one that matches.

Untested, but something like this:

proc sql;
create table want as
select a.*, coalesce(t2.other_stuff, t3.other_stuff) as other_stuff
from have as a
left join secondTable as t2
on a.ticker=t2.ticker
left join secondTable as t3
on a.secondTicker=t3.ticker;
quit;
Astounding
PROC Star

An elegant, relatively fast approach would use two hash tables.  In one DATA step:

 

  1. Create a hash table from B that uses TICKER as the key and OTHER_STUFF as the data.
  2. Create another hash table from B that uses SECOND_TICKER as the key and OTHER_STUFF as the data.
  3. Read in an observation from data set A, and compare to the hash tables.  More specifically, see if there's a match in hash table #1.  If so, read in OTHER_STUFF.  If not, see if there is a match in hash table #2.  If so, read in OTHER_STUFF.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 898 views
  • 0 likes
  • 3 in conversation