I have two tables that I want to join, call them ID_table and merge_table.
data Id_table;
input Name $ Parent_name $ var1 ;
datalines;
ABC ABC_Co 28.6
HLE ABC_Co 42.1
Ggl_inc google 18.8
APPLE APPL_Inc 123.4
;
run;
data merge_table;
input Comp_Name $ var2 ;
datalines;
ABC_Co 110
Ggl_inc 105.3
APPLE 112.1
XYZ 101
;
run;
I want sas to merge on either: Name = comp_name, or Parent_name = comp_name.
If I do this:
PROC SQL;
CREATE TABLE merge_test AS
SELECT l.*, r.*
FROM Id_table AS l LEFT JOIN merge_table AS r
ON (l.Name = r.Comp_Name or l.Parent_name = r.Comp_Name);
QUIT;
It works. However, my actual data set has 582k obs and 300k obs in both tables, and when I run the above command it seems to take forever and SAS times out.
Is there a more efficient way to program this?
Furthermore, I still want to keep unmerged matches in my data set, but SAS doesn't seem to be doing so in the above test sample (the values for comp_name = XYZ from merge_table are dropped). Is there a way it is included?
thanks!
Maxim 2: Read the Log.
The SQL procedure notifies you that it has to do cartesian joins, which is often a bad sign, performancewise.
Split your join into two, and concatenate the results:
proc sql;
create table merge_test as
select l.*, r.*
from Id_table as l left join merge_table as r
on l.Name = r.Comp_Name
where not missing(r.comp_name)
union
select l.*, r.*
from Id_table as l left join merge_table as r
on l.Parent_name = r.Comp_Name
where not missing(r.comp_name)
;
quit;
Maxim 2: Read the Log.
The SQL procedure notifies you that it has to do cartesian joins, which is often a bad sign, performancewise.
Split your join into two, and concatenate the results:
proc sql;
create table merge_test as
select l.*, r.*
from Id_table as l left join merge_table as r
on l.Name = r.Comp_Name
where not missing(r.comp_name)
union
select l.*, r.*
from Id_table as l left join merge_table as r
on l.Parent_name = r.Comp_Name
where not missing(r.comp_name)
;
quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.