Hi everyone,
I'm trying to merge two datasets, ds1 and ds2, and create columns to indicate the source of each row in the output dataset, dsout.
For example,
data dsout;
merge ds1 (in=a)
ds2 (in=b);
by var;
from_ds1 = a;
from_ds2 = b;
run;
I'm wondering if there's a simple way to do that by proc sql
Thanks for your help.
You can approximate the same operation with:
data ds1;
input var var1 $;
datalines;
1 x
2 y
;
data ds2;
input var var2 $;
datalines;
2 J
3 N
;
proc sql;
select
*,
not missing(ds1.var) as inDs1,
not missing(ds2.var) as inDs2
from
ds1 natural full join ds2
quit;
sort of...
Good question. Proc sql joins work by joining based on values i.e m*n product for each by group however merge does by position of by groups record by record. Therefore in= options kinda thing wouldn't be relevant from a stand point of PDV vs product cartesian(for each by group)
Of course saying that, you could create a flag to keep track I suppose
data ds1;
input var var1 $;
datalines;
1 x
2 y
;
data ds2;
input var var2 $;
datalines;
2 J
3 N
;
data dsout;
merge ds1 (in=a)
ds2 (in=b);
by var;
length dataset $10.;
if a then dataset="ds1";
if b then dataset ="ds2";
if a and b then dataset="ds1+ds2";
run;
proc sql;
create table dsout_sql1 as
select coalesce(a.var, b.var) as var,
var1,
var2,
case when a.var =b.var then "ds1+ds2"
when a.var = . then "ds2"
when b.var = . then "ds1" end as dataset
from ds1 a
full join
ds2 b
on a.var = b.var;
You can approximate the same operation with:
data ds1;
input var var1 $;
datalines;
1 x
2 y
;
data ds2;
input var var2 $;
datalines;
2 J
3 N
;
proc sql;
select
*,
not missing(ds1.var) as inDs1,
not missing(ds2.var) as inDs2
from
ds1 natural full join ds2
quit;
sort of...
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.