BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
DmytroYermak
Lapis Lazuli | Level 10

Could you please remind how to join several tables 

 

proc sql;
      select key.x1, bb.x1, cc.x2, dd.x3, ee.x4
from x1,x2,x3,x4
where key.x1=key.x2=key.x3=key.x4
;
quit;

 What is the correct statement: key.x1=key.x2=key.x3=key.x4 ?

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

@DmytroYermak wrote:

Thank you both for the detailed analysis. I wanted to put here my not fancy code but am seeing that have to investigatate yours that is working ).


As long as you are not trying to do many to many merge then it is much easier to code multiple dataset merges using normal SAS code than to trying to force into SQL syntax. 

For example you could merge 20 datasets.  List the variables you want to drop or keep. And best of all no ***** commas needed.

data want ;
  merge t1(in=in1) t2-t20 ;
  by x ;
  if in1;
  drop C D ;
run;

View solution in original post

15 REPLIES 15
Patrick
Opal | Level 21

@DmytroYermak

You need to refer to variables using <table name|table alias>.<variable name> so it should be x1.key and not key.x1

Assuming you want an inner join you need an AND between the conditions so something like:

x1.key=x2.key and x1.key=x3.key and x1.key=x4.key and x2.key=x3.key and x2.key=x4.key and x3.key=x4.key

 

DmytroYermak
Lapis Lazuli | Level 10
Thank you, Patrick. I actually thought about this solution but wanted to find anything shorter. That is, actually, the case.
DmytroYermak
Lapis Lazuli | Level 10
You are right, it is my mistake, definitely xn.key=..
PGStats
Opal | Level 21

Sometimes there is a way to make a SAS/SQL join without having to name the variables. If the keys in your tables are the only column names your tables have in common, you can request natural joins.

 

proc sql;

select *

from x1 natural join x2 natural join x3 natural join x4;

quit;

 

The joins will be performed on all shared column names.

PG
DmytroYermak
Lapis Lazuli | Level 10

I have to comment my question as I think that mislead you in someway.

I have the following tables:

6.jpg

I think that the optimal code is as mentioned here - https://communities.sas.com/t5/SAS-Data-Management/PROC-SQL-Joins-on-3-Tables/td-p/196808:

 

 

proc sql
  create table abc as
  select X
  from table1
  left join table2
  on table1.X = table2.X
  left join table3
  on table1.X = table3.X;
quit;

 

The question is how to eliminate columns C and D.

 

I understand that the solution was metioned above - to use just simple 'select...from' statement with the following 'where':

table1.x=table2.x and table1.x=table2.x and table1.x=table3.x. But would it be possible to use join and eliminate the appropriate columns C and D as in real task the number of tables is quite significant as well as the number of columns to be selected?

PGStats
Opal | Level 21

select table1.x, table2.b, table3.d

from ...

PG
DmytroYermak
Lapis Lazuli | Level 10
Could you please write the 'where' statement as well?
Patrick
Opal | Level 21

@DmytroYermak


@DmytroYermak wrote:
Could you please write the 'where' statement as well?

The rest of the SQL code you've posted was fine for the example you've posted. Just combine the SELECT bit as posted by @PGStats with the other parts you've posted already.

Tom
Super User Tom
Super User

@DmytroYermak wrote:

I have to comment my question as I think that mislead you in someway.

I have the following tables:

6.jpg

I think that the optimal code is as mentioned here - https://communities.sas.com/t5/SAS-Data-Management/PROC-SQL-Joins-on-3-Tables/td-p/196808:

 

 

proc sql
  create table abc as
  select X
  from table1
  left join table2
  on table1.X = table2.X
  left join table3
  on table1.X = table3.X;
quit;

 

The question is how to eliminate columns C and D.

 

I understand that the solution was metioned above - to use just simple 'select...from' statement with the following 'where':

table1.x=table2.x and table1.x=table2.x and table1.x=table3.x. But would it be possible to use join and eliminate the appropriate columns C and D as in real task the number of tables is quite significant as well as the number of columns to be selected?


Sometimes it is easier to take advantage of SAS dataset options.  

For example if you want everything but C and D then you drop them on the way out.

proc sql ;
create table abc(drop=C D) as
  select *
  from table1
  natural left join table2
  natural left join table3
;
quit;

Or drop them on the way in.

proc sql ;
create table abc as
  select *
  from table1
  natural left join table2(drop=C)
  natural left join table3(drop=D)
;
quit;
PGStats
Opal | Level 21

Hi @Tom, I also thought of this solution, but natural left joins are somewhat counter intuitive. There seems to be no way to force

 

select x, b, d
from t1 natural left join t2 natural left join t3;

 

to be like

 

select t1.x, b, d
from t1 left join t2 on t1.x=t2.x left join t3 on t1.x=t3.x;

 

it keeps being interpreted as

 

select t1.x, b, d
from t1 left join t2 on t1.x=t2.x left join t3 on t2.x=t3.x;

 

even with parentheses, which means that t1 cannot be treated as the key reference for both joins.

 

One way around this with natural joins is

 

data t1;

infile datalines missover;

input x a$;

datalines;

1

2 a

3 a

4 a

5 a

6

;

data t2;

infile datalines missover;

input x b$ c$;

datalines;

1 b c

3 b c

5 b c

;

data t3;

infile datalines missover;

input x d$ e$;

datalines;

2 d e

4 d e

6 d e

;

proc sql;

select x, b, d

from

( select *

from t1 natural left join t2 )

natural join

( select *

from t1 natural left join t3 )

;

quit;

 

PG
Tom
Super User Tom
Super User

@PGStats

Good to know about how SAS performs the NATURAL joins.

 

In this case you can nest the queries to get SAS to do the same thing as it would in a normal merge.

Note that if you want the SQL process to put the variables in the same order as a normal merge would then you need to reference the tables in SQL query in the reverse order. 

data want ;
  merge t1(in=in1) t2 t3;
  by x ;
  if in1;
run;

proc sql ;
  create table test2 as
  select * 
  from (select * from t3 natural full join t2 ) c1
  natural right join t1
  ;
quit;
proc print data=want;
run;
proc print data=test2;
run;
proc compare data=want compare=test2 ; run;
Obs    x    a    b    c    d    e

 1     1         b    c
 2     2    a              d    e
 3     3    a    b    c
 4     4    a              d    e
 5     5    a    b    c
 6     6                   d    e


Obs    x    a    b    c    d    e

 1     1         b    c
 2     2    a              d    e
 3     3    a    b    c
 4     4    a              d    e
 5     5    a    b    c
 6     6                   d    e



The COMPARE Procedure
Comparison of WORK.WANT with WORK.TEST2
(Method=EXACT)

NOTE: No unequal values were found. All values compared are exactly equal.
DmytroYermak
Lapis Lazuli | Level 10

Thank you both for the detailed analysis. I wanted to put here my not fancy code but am seeing that have to investigatate yours that is working ).

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 15 replies
  • 3000 views
  • 0 likes
  • 6 in conversation