proc sql;
create table want as
select date, dea (dec.id=tap.id) as indic,
(dec.id ~=".") as decid,
(tap.id ~=".") as tapid,
coalesce (dec.id, tap.id) as id
from dec full join tap on dec.id=tap.id;
quit;
I have two datasets that I want to merge. dataset dec has 240000 obs, dataset tap has 14000 obs. They do have 13000 in common.
I want to merge these two. My above code can generate a table, but cannot keep all these columns that I need.
I can keep all columns and all obs, that is 24000.
I have 40 varibles, that I prefer to keep 15 variables of these 40 from dataset tap , and want to keep 5 varibles from dec. So it is not wise to type. Among them, some IDs are common.
Can anybody give advice?
Thanks.
If you use SQL, you have no option but to type each variable name, unless you just use *.
An option around this constraint is to use the keep= data set option to reduce the number of variables.
keep= allows you to use the colon, dash and double-dash list shortcuts such as VAR: or VAR1-VAR10 or VARA--ID .
If you use a data step merge instead of SQL, you have more flexibility in terms of implicitly naming variables, and you can monitor and count which observations for which table are kept or dropped.
Thanks. But the problem is, I want to keep COALESCE function, and I want to keep all these 240000 obs
if I just use a regular full join, I do not know how to add coalesce funtion,
when i use
on a.ID=b.ID, I am afriad that I lose obs
This reply is very confusing:
- the coalesce function is not used for the join
- you already use on a.id = b.id
Please provide an example with 5-record input tables and the expected output.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.