This is a repost since I didn't get the answer I was hoping for previously.
I am trying to merge 3 datasets together so that I only have unique records at the end.
I have provided the code and an example of how I would like the final table to look like in the end. I have highlighted the part I am having issues with. The first table is the "base" table, and what I want to do is to either merge the keys from the other 2 tables if there is a match, or add (or append) the ones that don't have a match.It is also important to keep the "left-most" name. So in the case below, I want to keep name 'ghi' since it is in the first table, and put the key for the 2nd table in the "want" table on that record since the names match, and I also want to have the key for the 3rd table since the keys matched between 'ghi' in the 2nd table and 'vwx' in the 3rd table. I also want to not keep the 'vwx' from the 3rd table since it found a match.
The additional layer I need is for the query to also not add 'pqr' to the final table since it matches 'have1' to 'have2' on 'key', and 'have2' to 'have3' on 'name', which is the part i can't get right.
If I didn't explain that well or if anyone has any additional questions, please feel free to reach out. Thank you and I appreciated the help in advance.
Thanks,
Tom
data have1;
input name1 $ 3. key1 4.;
datalines ;
abc 111
def 112
ghi 113
;
run;
data have2;
input name2 $ 3. key2 4.;
datalines ;
jkl 114
ghi 115
pqr 111
;
run;
data have3;
input name3 $ 3. key3 4.;
datalines ;
stu 117
vwx 115
pqr 118
;
run;
name | key1 | key2 | key3 |
abc | 111 | 111 | 118 |
def | 112 | ||
ghi | 113 | 115 | 115 |
jkl | 114 | ||
stu | 117 |
LIke this?
proc sql;
select coalescec(NAME1,NAME2,NAME3) as NAME
, KEY1
, KEY2
, KEY3
from HAVE1
full outer join
HAVE2
on NAME2=NAME1 or KEY2=KEY1
full outer join
HAVE3
on NAME3=NAME1 or KEY3=KEY1
or NAME3=NAME2 or KEY3=KEY2
order by NAME;
quit;
NAME | KEY1 | KEY2 | KEY3 |
---|---|---|---|
abc | 111 | 111 | 118 |
def | 112 | . | . |
ghi | 113 | 115 | 115 |
jkl | . | 114 | . |
stu | . | . | 117 |
LIke this?
proc sql;
select coalescec(NAME1,NAME2,NAME3) as NAME
, KEY1
, KEY2
, KEY3
from HAVE1
full outer join
HAVE2
on NAME2=NAME1 or KEY2=KEY1
full outer join
HAVE3
on NAME3=NAME1 or KEY3=KEY1
or NAME3=NAME2 or KEY3=KEY2
order by NAME;
quit;
NAME | KEY1 | KEY2 | KEY3 |
---|---|---|---|
abc | 111 | 111 | 118 |
def | 112 | . | . |
ghi | 113 | 115 | 115 |
jkl | . | 114 | . |
stu | . | . | 117 |
Thank you for the quick response, it was exactly what I needed.
If I wanted to change this problem up a little, would I be able to do a similar process using the example tables below? In essence, I believe it's doing a similar thing, but on multiple sets of keys instead of just one. My thoughts initially are that the null values might cause issues, or it could have issues since a key could match on multiple other keys...I appreciate any feedback in advance!
NEW TABLES:
data have1;
input name1 $ 3. key1 4. key2 4. key3 4.;
datalines ;
aaa 111 222 .
bbb 112 . 334
ccc 113 . .
;
run;
data have2;
input name2 $ 3. key1 4. key2 4. key3 4.;
datalines ;
ddd 111 222 333
eee . 223 334
fff . 224 335
;
run;
data have3;
input name3 $ 3. key1 4. key2 4. key3 4.;
datalines ;
ggg . 223 .
hhh 113 225 333
iii . . 336
;
run;
Want:
name | key1 | key2 | key3 |
aaa | 111 | 222 | 333 |
bbb | 112 | 223 | 334 |
ccc | 113 | 225 | 333 |
fff | 224 | 335 | |
iii | 336 |
I think I figured it out...but please anyone let me know if you see anything that could cause an issue..
proc sql;
create table want as
select coalescec(NAME1,NAME2,NAME3) as NAME
, coalesce(a.KEY1,b.key1,c.key1) as key1
, coalesce(a.KEY2,b.key2,c.key2) as key2
, coalesce(a.KEY3,b.key3,c.key3) as key3
from HAVE1 a
full outer join HAVE2 b on b.KEY1=a.KEY1 or b.KEY2=a.KEY2 or b.KEY3=a.KEY3
full outer join HAVE3 c on c.key1=coalesce(a.key1,b.key1,0) or c.key2=coalesce(a.key2,b.key2,0) or c.key3=coalesce(a.key3,b.key3,0)
order by NAME
;quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.