BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ADN
Fluorite | Level 6 ADN
Fluorite | Level 6

Hi,

 

I have 4 tables T1,T2,T3 and T4 and also a master table Z. The condition I have to join the tables to the master table is 

 

T1.ID1=Z.ID OR T1.ID2=Z.ID OR T1.ID3=Z.ID OR T1.ID4=Z.ID

 

Is there any way to optimize these type of joins? This join is taking too much time for me(> 1 hr).

 

The tables T1,T2,T3 and T4 have 100k obs and Z has 400k obs approx.

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

If the ID's in T1-T4 are mutually exclusive (like they look in your example), I'd first concatenate those datasets:

data t1;
set t1-t4;
run;

and then sort/merge with Z:

proc sort data=T;
by id;
run;

proc sort data=Z;
by id;
run;

data want;
merge
  Z (in=in_z)
  T
;
by id;
if in_z;
run;

View solution in original post

3 REPLIES 3
Kurt_Bremser
Super User

How is the relationship between those tables with regards to id? one-to-one, one-to-many, or many-to-many?

 

Depending on that, and the variables contained, sorting the tables by id and doing a data step merge might be the preferred solution.

 

Could you post some example data that shows the typical structures/contents (see my footnotes for converting datasets to data steps and posting the code). Just enough observations to clarify the issue.

ADN
Fluorite | Level 6 ADN
Fluorite | Level 6

The master table Z is a union of all the ID's present in tables T1/T2/T3 and T4. ID's present in all the 5 tables are disctinct.

 

Unfortunately the data belongs to my client and I cant post it. But here is some sample data for the same.

 

data T1;

input ID $20;

datalines;

1

2

3

;

run;

 

data T2;

input ID $20;

datalines;

4

5

6

;

run;

 

data T3;

input ID $20;

datalines;

7

8

9

;

run;

 

data T4;

input ID $20;

datalines;

10

11

12

;

run;

 

data Z;

input ID $20;

datalines;

1

2

3

4

5

6

7

8

9

10

11

12

;

run;

Kurt_Bremser
Super User

If the ID's in T1-T4 are mutually exclusive (like they look in your example), I'd first concatenate those datasets:

data t1;
set t1-t4;
run;

and then sort/merge with Z:

proc sort data=T;
by id;
run;

proc sort data=Z;
by id;
run;

data want;
merge
  Z (in=in_z)
  T
;
by id;
if in_z;
run;

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 2717 views
  • 1 like
  • 2 in conversation