Solved: Use PROC SQL to merge more than two datasets by one common variable

ANKH1 · Posted 10-25-2022 09:36 PM

Hi,

I would like to merge more than two datasets using PROC SQL. I want to merge them by ID (all datasets have this variable in common). These datasets (around 10) each have a different number of columns. So far I have only found a way to join two datasets (example below).

PROC SQL;
SELECT A.*, B.*
FROM STATES AS A, CITYS AS B
WHERE A.ID=B.ID;
QUIT;

Tom · Posted 10-25-2022 11:45 PM

Why PROC SQL?

If you want to merge multiple data sets, it is much easier to use a SAS data step with a MERGE statement instead.

data want;
  merge one two three ;
  by id;
run;
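
As a minimal sketch of that data step in practice: MERGE with a BY statement expects each input to be sorted (or indexed) by the BY variable, so the datasets ONE, TWO and THREE are sorted first. The IN= flag names (in1, in2, in3) and the subsetting IF are illustrative additions, not part of the original answer; leave the IF out to keep every ID that appears in any dataset.

/* Sort each input by the BY variable before merging */
proc sort data=one;   by id; run;
proc sort data=two;   by id; run;
proc sort data=three; by id; run;

data want;
  /* in= creates temporary 0/1 flags showing which inputs contributed to the row */
  merge one(in=in1) two(in=in2) three(in=in3);
  by id;
  /* optional: keep only IDs present in all three inputs */
  if in1 and in2 and in3;
run;

Note that a data step merge and an SQL join can give different results when ID is repeated in more than one dataset, so this is most comparable to the join when ID is unique in each input.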

If you want to "join" in SQL, then you should probably be explicit about the type of join you want to do.

So, assuming you only want the observations that have data in all three datasets, use an INNER join.

proc sql;
create table want as
select *
from one a 
inner join two b 
  on a.id = b.id
inner join three c
  on a.id = c.id
;
quit;

But you probably also need to be careful about which variables you select. Using the * shortcut to select all variables will generate a warning that ID already exists in the dataset, because the select list includes A.ID, B.ID, and C.ID. Since the dataset WANT can only have one variable named ID, the values from the first one (A.ID) are kept. With an INNER join it does not matter, since you only keep the rows where the values of the ID variable match.
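
One way to avoid the duplicate ID columns is to list the columns explicitly. The sketch below assumes hypothetical variables VAR_B in TWO and VAR_C in THREE; substitute the real column names from your own datasets.

proc sql;
create table want as
select a.*        /* includes ID once, from ONE */
     , b.var_b    /* hypothetical column from TWO */
     , c.var_c    /* hypothetical column from THREE */
from one a
inner join two b
  on a.id = b.id
inner join three c
  on a.id = c.id
;
quit;

If you switch to an outer join, the choice of ID does matter, because A.ID can be missing for rows that exist only in TWO or THREE. A hedged variant for that case uses COALESCE to take the first non-missing ID (VAR_A is also a hypothetical column name):

proc sql;
create table want_all as
select coalesce(a.id, b.id, c.id) as id
     , a.var_a
     , b.var_b
     , c.var_c
from one a
full join two b
  on a.id = b.id
full join three c
  on coalesce(a.id, b.id) = c.id
;
quit;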

ANKH1 · Posted 10-30-2022 06:22 PM

Thank you! I used the first solution you mentioned.

Ksharp · Posted 10-26-2022 08:26 AM

proc sql;
create table want as
select *
from
one
natural join
two
natural join
three
;
quit;
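
A note on the approach above: NATURAL JOIN matches on every column that has the same name and type in both tables, not just ID. With around 10 datasets it may be worth checking first that ID is the only shared column. The query below is a rough sketch that assumes the datasets live in the WORK library and are named ONE, TWO and THREE; it lists column names that occur in more than one of them.

proc sql;
select upcase(name) as column_name
     , count(*) as n_tables
from dictionary.columns
where libname = 'WORK'
  and memname in ('ONE', 'TWO', 'THREE')
group by 1          /* group by the first select item (the upper-cased name) */
having count(*) > 1
;
quit;

If anything other than ID shows up in the result, the natural join will also require those columns to match, which may drop rows you expected to keep.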

himself · Posted 10-31-2022 05:22 AM

Hi,

Here is a link to a similar question. The examples shown there look like what you want:

Merge-Multiple-tables

Hope it will be useful

ANKH1 · Posted 10-31-2022 03:24 PM

Thanks! The link is not working for some reason.
