Hi,
I need to do Hash joins to get the following result by joining B to A
Dataset: A | |||
ID | Name | Sex | |
1 | Jay | M | |
2 | Jee | M | |
3 | Min | F | |
Dataset: B | |||
ID | Sal | ||
1 | 55 | ||
2 | 65 | ||
3 | 75 | ||
1 | 55 | ||
2 | 65 | ||
3 | 75 | ||
1 | 55 | ||
2 | 65 | ||
3 | 75 | ||
Result | |||
ID | Name | Sex | Sal |
1 | Jay | M | 55 |
2 | Jee | M | 65 |
3 | Min | F | 75 |
Why do you have to do a hash join? It is a lot easier using SQL or a DATA step merge.
It shouldn't be hard to propose a solution once we understand what you really have and need. Because your narrative (left join) doesn't match the desired result you've posted, it's a bit hard to propose anything.
Can you please answer @PGStats question and then "amend" your question so we can understand what needs to be done?
I don't believe that an SQL join or a DATA step merge is too "process constrained" to handle 7M records. I've got DATA step merges that handle close to 200M records in a reasonable time frame. My approach is to use the simplest method that still processes efficiently. Hash joins have their place and I use them myself, but only for special cases where they offer a major advantage.
If you have tried other techniques please post the results including run times so we help you choose the best approach.
You posted result is not what a left join would produce. It would match every A record with each B record with the same ID, produce a new record for each match. Result would be 9 records.
As @mkeintz said, the example you gave is not a left join. Given the following data, what would be the result? Knowing what your operational constraints are would also help us to propose an appropriate solution.
Dataset: A ID Name Sex 1 Jay M 2 Jee M 3 Min F
4 Xxx M Dataset: B ID Sal 1 55 2 65 3 75 1 155 2 165 3 175 1 255 2 365 3 375 5 500 Result ? ID Name Sex
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.