BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
UniversitySas
Quartz | Level 8

I have two tables that I want to join, call them ID_table and merge_table.

data Id_table;
   input Name $ Parent_name $  var1 ;
   datalines;
ABC  ABC_Co 28.6
HLE  ABC_Co 42.1
Ggl_inc google 18.8
APPLE APPL_Inc 123.4
;
run;

data merge_table;
   input Comp_Name $ var2 ;
   datalines;
ABC_Co 110
Ggl_inc 105.3
APPLE 112.1
XYZ 101
;
run;

I want sas to merge on either: Name = comp_name, or Parent_name = comp_name.

If I do this:

PROC SQL;
	CREATE TABLE merge_test AS
	SELECT l.*, r.*
	FROM Id_table AS l LEFT JOIN merge_table AS r
	ON	 (l.Name = r.Comp_Name or l.Parent_name = r.Comp_Name);
QUIT;

It works. However, my actual data set has 582k obs and 300k obs in both tables, and when I run the above command it seems to take forever and SAS times out.

 

Is there a more efficient way to program this?

 

Furthermore, I still want to keep unmerged matches in my data set, but SAS doesn't seem to be doing so in the above test sample (the values for comp_name = XYZ from merge_table are dropped). Is there  a way it is included?

 

thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Maxim 2: Read the Log.

The SQL procedure notifies you that it has to do cartesian joins, which is often a bad sign, performancewise.

Split your join into two, and concatenate the results:

proc sql;
create table merge_test as
  select l.*, r.*
  from Id_table as l left join merge_table as r
  on l.Name = r.Comp_Name
  where not missing(r.comp_name)
union
  select l.*, r.*
  from Id_table as l left join merge_table as r
  on l.Parent_name = r.Comp_Name
  where not missing(r.comp_name)
;
quit;

View solution in original post

1 REPLY 1
Kurt_Bremser
Super User

Maxim 2: Read the Log.

The SQL procedure notifies you that it has to do cartesian joins, which is often a bad sign, performancewise.

Split your join into two, and concatenate the results:

proc sql;
create table merge_test as
  select l.*, r.*
  from Id_table as l left join merge_table as r
  on l.Name = r.Comp_Name
  where not missing(r.comp_name)
union
  select l.*, r.*
  from Id_table as l left join merge_table as r
  on l.Parent_name = r.Comp_Name
  where not missing(r.comp_name)
;
quit;

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 1 reply
  • 607 views
  • 2 likes
  • 2 in conversation