BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
zihdonv19
Obsidian | Level 7

I have several datasets that I want to use proc sql to join. but here to simplify, I use 3 datasets as example:

 

data dat1;
input id x1;
cards;
1 1
2 0
3 0
;
run;

data dat2;
input id x2;
cards;
2 1 
3 1 
5 0 
;
run;

data dat3;
input id x3;
cards;
3 0 
4 1 
6 0 
;
run;

I want to merge them based one id, with one id per row, the output should be something like this:

id x1 x2 x3
1 1 . .
2 0 1 .
3 0 1 0
4 . . 1
5 . 0 .
6 . . 0
1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Why would you use SQL for that?  It is trivial with normal SAS code.

data want;
  merge dat1-dat3;
  by id;
run;

Result:

Obs    id    x1    x2    x3

 1      1     1     .     .
 2      2     0     1     .
 3      3     0     1     0
 4      4     .     .     1
 5      5     .     0     .
 6      6     .     .     0

If you did need to do it in PROC SQL code then the easiest is to use NATURAL joins.

proc sql;
create table want as
select * 
from dat2 
natural full join dat3
natural full join dat1 
order by id
;
quit;

(I referenced the datasets in that strange order so that X1 to X3 get added to the dataset in a nicer order so I could use * as the variable list.)

 

If you don't use NATURAL joins you will have to use the COALESCE() function to get the resulting ID variable and spell out the join criteria.

proc sql;
create table want as
select coalesce(a.id,b.id,c.id) as id,x1,x2,x3 
from dat1 a
full join dat2 b on a.id=b.id
full join dat3 c on a.id=c.id or b.id=c.id 
order by 1
;
quit;

View solution in original post

3 REPLIES 3
SASJedi
SAS Super FREQ

Your data is already sorted by ID, so this will do the trick:

data want;
	merge dat1 dat2 dat3;
	by id;
run;
Check out my Jedi SAS Tricks for SAS Users
ballardw
Super User

A very important question to ask is are the values of ID duplicated in any of the sets? If so, how do you want the result to appear?

 

If your data is sorted (as implied by the example data sets) by ID and none of the ID are duplicated within a single set then a Data step merge is going to be way easier:

 

data want;
   merge dat1 dat2 dat3;
   by id;
run;

The question about duplicates is very important because SQL joins are likely to result in multiple output rows.

An example

data dat1;
input id x1;
cards;
1 1
2 0
2 1
;
run;

data dat2;
input id x2;
cards;
2 1 
2 3 
5 0 
;
run;

proc sql;
   create table example as
   select a.id,a.x1,b.x2
   from dat1 as a
        left join 
        dat2 as b
        on a.id=b.id
   ;
quit;

Note that the two values from Id=2 are combined with both Id=2 in the other set resulting in 4 id=2 observations (and yes, this example does get the Id=5 from set 2 as that is another headache in SQL)

Tom
Super User Tom
Super User

Why would you use SQL for that?  It is trivial with normal SAS code.

data want;
  merge dat1-dat3;
  by id;
run;

Result:

Obs    id    x1    x2    x3

 1      1     1     .     .
 2      2     0     1     .
 3      3     0     1     0
 4      4     .     .     1
 5      5     .     0     .
 6      6     .     .     0

If you did need to do it in PROC SQL code then the easiest is to use NATURAL joins.

proc sql;
create table want as
select * 
from dat2 
natural full join dat3
natural full join dat1 
order by id
;
quit;

(I referenced the datasets in that strange order so that X1 to X3 get added to the dataset in a nicer order so I could use * as the variable list.)

 

If you don't use NATURAL joins you will have to use the COALESCE() function to get the resulting ID variable and spell out the join criteria.

proc sql;
create table want as
select coalesce(a.id,b.id,c.id) as id,x1,x2,x3 
from dat1 a
full join dat2 b on a.id=b.id
full join dat3 c on a.id=c.id or b.id=c.id 
order by 1
;
quit;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 194 views
  • 6 likes
  • 4 in conversation