BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Toni2
Lapis Lazuli | Level 10

i have 2 datasets :

 

A 1 p
B 2 p
C 1 p
D 1 p

 

and :

 

B 2 s
D 1 s

 

and want to merge to this : 

 

A 1 p
B 2 s
C 1 p
D 1 s

 

this is the V1 is always populated with values from both tables A and B

 

data a;
input ID1 $ ID2 V1 $;
 datalines;
A 1 p 
B 2 p
C 1 p
D 1 p
;
run;

data b;
input ID1 $ ID2 V1 $;
 datalines;
B 2 s
D 1 s
;
run;

proc sql; create table c as select x.ID1, x.ID2, y.v1 from A as x left join B as y on x.ID1 =y.ID1 and x.ID2 =y.ID2; quit;
1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data a;
input ID1 $ ID2 V1 $;
 datalines;
A 1 p 
B 2 p
C 1 p
D 1 p
;
run;

data b;
input ID1 $ ID2 V1 $;
 datalines;
B 2 s
D 1 s
;
run;
proc sql;
create table c  as
 select
 x.ID1,
 x.ID2,
 coalescec(y.v1,x.v1) as v1
  from A  as x left join B as y
on x.ID1 =y.ID1 and x.ID2 =y.ID2; 
quit;

View solution in original post

4 REPLIES 4
Ksharp
Super User
data a;
input ID1 $ ID2 V1 $;
 datalines;
A 1 p 
B 2 p
C 1 p
D 1 p
;
run;

data b;
input ID1 $ ID2 V1 $;
 datalines;
B 2 s
D 1 s
;
run;
proc sql;
create table c  as
 select
 x.ID1,
 x.ID2,
 coalescec(y.v1,x.v1) as v1
  from A  as x left join B as y
on x.ID1 =y.ID1 and x.ID2 =y.ID2; 
quit;
Patrick
Opal | Level 21

Thank you for providing all the sample data as fully working data steps, providing the SQL you've already got and also showing us the desired result. That made it really simple to "fill in" the last little gap which was that you didn't actually formulate your question.

What you're looking for is the coalesce() function which will pick the first non-missing value.

data a;
  input ID1 $ ID2 V1 $;
  datalines;
A 1 p 
B 2 p
C 1 p
D 1 p
;
run;

data b;
  input ID1 $ ID2 V1 $;
  datalines;
B 2 s
D 1 s
;
run;

proc sql;
  create table c  as
    select
      x.ID1,
      x.ID2,
      coalesce(y.v1,x.v1) as v1
    from A  as x left join B as y
      on x.ID1 =y.ID1 and x.ID2 =y.ID2;
quit;

 

Tom
Super User Tom
Super User

For the posted code you could just use normal SAS code instead of SQL.  The only reason to be forced to have to jury rig something using SQL would be if you wanted to do a many to many join.

 

If you want the values from B to overwrite the values from A then use a MERGE.

data want;
  merge a b;
  by id1;
run;

If you only want the non-missing values from B to override the values from A then use UPDATE instead.

data want;
  update a b;
  by id1;
run;
ballardw
Super User

Generic comment: If there are no duplicates of the ID and ID2 combination in the first data set then this is actually an UPDATE, replacing a value on matching values.

 

data c;
  update a b;
  by id1 id2;
run;

Though the data step UPDATE will require sets A and B to be sorted by the By variables if not actually in that order.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 794 views
  • 4 likes
  • 5 in conversation