I would like to merge the below two datasets with id, type, type2 and code and If I don’t find a match I would like to take the value from the second dataset were the keys are missing and assign to the first datasets.
Here is the desired output.
Dataset-1
data t3;
input id code $ type $ type2 $ price;
datalines;
101 AAA L80 GT80 50.0
101 AAB L60 LT20 20.2
101 AAC L20 LE60 30.5
101 ABA L80 EQ20 40.2
102 AAA L80 GT80 50.2
102 AAB L60 LT20 51.0
102 AAC L20 LE60 30.2
102 ABA L80 GT20 20.5
;
run;
dataset-2
data t4;
input id code $ type $ type2 $ new_price;
datalines;
101 . . . 25.5
101 AAA L80 GT80 22.5
101 AAA L20 GT80 1.0
102 . . . 32.5
102 AAC L20 LE60 18.5
102 AAC L20 LE60 12.0
102 ACC L80 GT30 11.0
;
run;
Desired output;
id |
code |
type |
type2 |
price |
new_price |
101 |
AAA |
L80 |
GT80 |
50 |
22.5 |
101 |
AAB |
L60 |
LT20 |
20.2 |
25.5 |
101 |
AAC |
L20 |
LE60 |
30.5 |
25.5 |
101 |
ABA |
L80 |
EQ20 |
40.2 |
25.5 |
102 |
AAA |
L80 |
GT80 |
50.2 |
32.5 |
102 |
AAB |
L60 |
LT20 |
51 |
32.5 |
102 |
AAC |
L20 |
LE60 |
30.2 |
12 |
102 |
ABA |
L80 |
GT20 |
20.5 |
32.5 |
Does below return what you're after?
data t3;
input id code $ type $ type2 $ price;
datalines;
101 AAA L80 GT80 50.0
101 AAB L60 LT20 20.2
101 AAC L20 LE60 30.5
101 ABA L80 EQ20 40.2
102 AAA L80 GT80 50.2
102 AAB L60 LT20 51.0
102 AAC L20 LE60 30.2
102 ABA L80 GT20 20.5
;
run;
data t4;
input id code $ type $ type2 $ new_price;
datalines;
101 . . . 25.5
101 AAA L80 GT80 22.5
101 AAA L20 GT80 1.0
102 . . . 32.5
102 AAC L20 LE40 18.5
102 AAC L20 LE60 12.0
102 ACC L80 GT30 11.0
;
run;
proc sql;
create table want as
select
l.*,
coalesce(r.new_price, r2.new_price) as new_price
from
t3 as l
left join
t4 as r
on l.id=r.id and l.code=r.code and l.type=r.type and l.type2=r.type2
left join
t4 as r2
on l.id=r2.id and missing(r2.code) and missing(r2.type) and missing(r2.type2)
;
quit;
There was a duplicate key in your t4 table. I've made the assumption that this is a typo and changed one of the key values as below to avoid issues.
102 AAC L20 LE40 18.5
102 AAC L20 LE60 12.0
Proc sort data=t3; by id code type type2; run;
Proc sort data=t4; by id code type type2; run;
data temp1 temp2;
merge t3 (in=in3) t4 (in=in4);
by id code type type2;
if in3;
if in3 and in4 then output temp1;
else output temp2;
run;
data temp2_new;
merge temp2 t4(where=(code=' ') keep=id new_price);
by id;
run;
data want;
merge temp1 temp2_new;
by id code type type2;
run;
I agree with Shmuel
Does below return what you're after?
data t3;
input id code $ type $ type2 $ price;
datalines;
101 AAA L80 GT80 50.0
101 AAB L60 LT20 20.2
101 AAC L20 LE60 30.5
101 ABA L80 EQ20 40.2
102 AAA L80 GT80 50.2
102 AAB L60 LT20 51.0
102 AAC L20 LE60 30.2
102 ABA L80 GT20 20.5
;
run;
data t4;
input id code $ type $ type2 $ new_price;
datalines;
101 . . . 25.5
101 AAA L80 GT80 22.5
101 AAA L20 GT80 1.0
102 . . . 32.5
102 AAC L20 LE40 18.5
102 AAC L20 LE60 12.0
102 ACC L80 GT30 11.0
;
run;
proc sql;
create table want as
select
l.*,
coalesce(r.new_price, r2.new_price) as new_price
from
t3 as l
left join
t4 as r
on l.id=r.id and l.code=r.code and l.type=r.type and l.type2=r.type2
left join
t4 as r2
on l.id=r2.id and missing(r2.code) and missing(r2.type) and missing(r2.type2)
;
quit;
There was a duplicate key in your t4 table. I've made the assumption that this is a typo and changed one of the key values as below to avoid issues.
102 AAC L20 LE40 18.5
102 AAC L20 LE60 12.0
Thanks for your help
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.