Hi,
I have 4 tables T1,T2,T3 and T4 and also a master table Z. The condition I have to join the tables to the master table is
T1.ID1=Z.ID OR T1.ID2=Z.ID OR T1.ID3=Z.ID OR T1.ID4=Z.ID
Is there any way to optimize these type of joins? This join is taking too much time for me(> 1 hr).
The tables T1,T2,T3 and T4 have 100k obs and Z has 400k obs approx.
If the ID's in T1-T4 are mutually exclusive (like they look in your example), I'd first concatenate those datasets:
data t1;
set t1-t4;
run;
and then sort/merge with Z:
proc sort data=T;
by id;
run;
proc sort data=Z;
by id;
run;
data want;
merge
Z (in=in_z)
T
;
by id;
if in_z;
run;
How is the relationship between those tables with regards to id? one-to-one, one-to-many, or many-to-many?
Depending on that, and the variables contained, sorting the tables by id and doing a data step merge might be the preferred solution.
Could you post some example data that shows the typical structures/contents (see my footnotes for converting datasets to data steps and posting the code). Just enough observations to clarify the issue.
The master table Z is a union of all the ID's present in tables T1/T2/T3 and T4. ID's present in all the 5 tables are disctinct.
Unfortunately the data belongs to my client and I cant post it. But here is some sample data for the same.
data T1;
input ID $20;
datalines;
1
2
3
;
run;
data T2;
input ID $20;
datalines;
4
5
6
;
run;
data T3;
input ID $20;
datalines;
7
8
9
;
run;
data T4;
input ID $20;
datalines;
10
11
12
;
run;
data Z;
input ID $20;
datalines;
1
2
3
4
5
6
7
8
9
10
11
12
;
run;
If the ID's in T1-T4 are mutually exclusive (like they look in your example), I'd first concatenate those datasets:
data t1;
set t1-t4;
run;
and then sort/merge with Z:
proc sort data=T;
by id;
run;
proc sort data=Z;
by id;
run;
data want;
merge
Z (in=in_z)
T
;
by id;
if in_z;
run;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.