BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
UniversitySas
Quartz | Level 8

I have two tables that I want to join, call them ID_table and merge_table.

data Id_table;
   input Name $ Parent_name $  var1 ;
   datalines;
ABC  ABC_Co 28.6
HLE  ABC_Co 42.1
Ggl_inc google 18.8
APPLE APPL_Inc 123.4
;
run;

data merge_table;
   input Comp_Name $ var2 ;
   datalines;
ABC_Co 110
Ggl_inc 105.3
APPLE 112.1
XYZ 101
;
run;

I want sas to merge on either: Name = comp_name, or Parent_name = comp_name.

If I do this:

PROC SQL;
	CREATE TABLE merge_test AS
	SELECT l.*, r.*
	FROM Id_table AS l LEFT JOIN merge_table AS r
	ON	 (l.Name = r.Comp_Name or l.Parent_name = r.Comp_Name);
QUIT;

It works. However, my actual data set has 582k obs and 300k obs in both tables, and when I run the above command it seems to take forever and SAS times out.

 

Is there a more efficient way to program this?

 

Furthermore, I still want to keep unmerged matches in my data set, but SAS doesn't seem to be doing so in the above test sample (the values for comp_name = XYZ from merge_table are dropped). Is there  a way it is included?

 

thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Maxim 2: Read the Log.

The SQL procedure notifies you that it has to do cartesian joins, which is often a bad sign, performancewise.

Split your join into two, and concatenate the results:

proc sql;
create table merge_test as
  select l.*, r.*
  from Id_table as l left join merge_table as r
  on l.Name = r.Comp_Name
  where not missing(r.comp_name)
union
  select l.*, r.*
  from Id_table as l left join merge_table as r
  on l.Parent_name = r.Comp_Name
  where not missing(r.comp_name)
;
quit;

View solution in original post

1 REPLY 1
Kurt_Bremser
Super User

Maxim 2: Read the Log.

The SQL procedure notifies you that it has to do cartesian joins, which is often a bad sign, performancewise.

Split your join into two, and concatenate the results:

proc sql;
create table merge_test as
  select l.*, r.*
  from Id_table as l left join merge_table as r
  on l.Name = r.Comp_Name
  where not missing(r.comp_name)
union
  select l.*, r.*
  from Id_table as l left join merge_table as r
  on l.Parent_name = r.Comp_Name
  where not missing(r.comp_name)
;
quit;

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 1 reply
  • 562 views
  • 2 likes
  • 2 in conversation