BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
UniversitySas
Quartz | Level 8

I have two tables that I want to join, call them ID_table and merge_table.

data Id_table;
   input Name $ Parent_name $  var1 ;
   datalines;
ABC  ABC_Co 28.6
HLE  ABC_Co 42.1
Ggl_inc google 18.8
APPLE APPL_Inc 123.4
;
run;

data merge_table;
   input Comp_Name $ var2 ;
   datalines;
ABC_Co 110
Ggl_inc 105.3
APPLE 112.1
XYZ 101
;
run;

I want sas to merge on either: Name = comp_name, or Parent_name = comp_name.

If I do this:

PROC SQL;
	CREATE TABLE merge_test AS
	SELECT l.*, r.*
	FROM Id_table AS l LEFT JOIN merge_table AS r
	ON	 (l.Name = r.Comp_Name or l.Parent_name = r.Comp_Name);
QUIT;

It works. However, my actual data set has 582k obs and 300k obs in both tables, and when I run the above command it seems to take forever and SAS times out.

 

Is there a more efficient way to program this?

 

Furthermore, I still want to keep unmerged matches in my data set, but SAS doesn't seem to be doing so in the above test sample (the values for comp_name = XYZ from merge_table are dropped). Is there  a way it is included?

 

thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Maxim 2: Read the Log.

The SQL procedure notifies you that it has to do cartesian joins, which is often a bad sign, performancewise.

Split your join into two, and concatenate the results:

proc sql;
create table merge_test as
  select l.*, r.*
  from Id_table as l left join merge_table as r
  on l.Name = r.Comp_Name
  where not missing(r.comp_name)
union
  select l.*, r.*
  from Id_table as l left join merge_table as r
  on l.Parent_name = r.Comp_Name
  where not missing(r.comp_name)
;
quit;

View solution in original post

1 REPLY 1
Kurt_Bremser
Super User

Maxim 2: Read the Log.

The SQL procedure notifies you that it has to do cartesian joins, which is often a bad sign, performancewise.

Split your join into two, and concatenate the results:

proc sql;
create table merge_test as
  select l.*, r.*
  from Id_table as l left join merge_table as r
  on l.Name = r.Comp_Name
  where not missing(r.comp_name)
union
  select l.*, r.*
  from Id_table as l left join merge_table as r
  on l.Parent_name = r.Comp_Name
  where not missing(r.comp_name)
;
quit;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 1 reply
  • 962 views
  • 2 likes
  • 2 in conversation