DATA Step, Macro, Functions and more

Proc SQL join with OR in where clause

Accepted Solution Solved
Reply
Occasional Contributor ADN
Occasional Contributor
Posts: 8
Accepted Solution

Proc SQL join with OR in where clause

Hi,

 

I have 4 tables T1,T2,T3 and T4 and also a master table Z. The condition I have to join the tables to the master table is 

 

T1.ID1=Z.ID OR T1.ID2=Z.ID OR T1.ID3=Z.ID OR T1.ID4=Z.ID

 

Is there any way to optimize these type of joins? This join is taking too much time for me(> 1 hr).

 

The tables T1,T2,T3 and T4 have 100k obs and Z has 400k obs approx.


Accepted Solutions
Solution
a month ago
Super User
Posts: 9,919

Re: Proc SQL join with OR in where clause

If the ID's in T1-T4 are mutually exclusive (like they look in your example), I'd first concatenate those datasets:

data t1;
set t1-t4;
run;

and then sort/merge with Z:

proc sort data=T;
by id;
run;

proc sort data=Z;
by id;
run;

data want;
merge
  Z (in=in_z)
  T
;
by id;
if in_z;
run;
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code

View solution in original post


All Replies
Super User
Posts: 9,919

Re: Proc SQL join with OR in where clause

How is the relationship between those tables with regards to id? one-to-one, one-to-many, or many-to-many?

 

Depending on that, and the variables contained, sorting the tables by id and doing a data step merge might be the preferred solution.

 

Could you post some example data that shows the typical structures/contents (see my footnotes for converting datasets to data steps and posting the code). Just enough observations to clarify the issue.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Occasional Contributor ADN
Occasional Contributor
Posts: 8

Re: Proc SQL join with OR in where clause

Posted in reply to KurtBremser

The master table Z is a union of all the ID's present in tables T1/T2/T3 and T4. ID's present in all the 5 tables are disctinct.

 

Unfortunately the data belongs to my client and I cant post it. But here is some sample data for the same.

 

data T1;

input ID $20;

datalines;

1

2

3

;

run;

 

data T2;

input ID $20;

datalines;

4

5

6

;

run;

 

data T3;

input ID $20;

datalines;

7

8

9

;

run;

 

data T4;

input ID $20;

datalines;

10

11

12

;

run;

 

data Z;

input ID $20;

datalines;

1

2

3

4

5

6

7

8

9

10

11

12

;

run;

Solution
a month ago
Super User
Posts: 9,919

Re: Proc SQL join with OR in where clause

If the ID's in T1-T4 are mutually exclusive (like they look in your example), I'd first concatenate those datasets:

data t1;
set t1-t4;
run;

and then sort/merge with Z:

proc sort data=T;
by id;
run;

proc sort data=Z;
by id;
run;

data want;
merge
  Z (in=in_z)
  T
;
by id;
if in_z;
run;
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 3 replies
  • 82 views
  • 1 like
  • 2 in conversation