BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
skcussas
Fluorite | Level 6

Hi everyone,

 

I'm trying to merge two datasets, ds1 and ds2, and create columns to indicate the source of each row in the output dataset, dsout.

For example,

 

data dsout;

    merge   ds1 (in=a)

            ds2 (in=b);

    by var;

    from_ds1 = a;

    from_ds2 = b;

run;

 

I'm wondering if there's a simple way to do that by proc sql

 

Thanks for your help.

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

You can approximate the same operation with:

 

data ds1;
 input var var1 $;
 datalines;
 1   x
 2   y
 ;
 
 data ds2;
 input var var2 $;
 datalines;
 2   J
 3   N
 ;

proc sql;
select 
    *, 
    not missing(ds1.var) as inDs1,
    not missing(ds2.var) as inDs2
from
ds1 natural full join ds2
quit;

sort of...

PG

View solution in original post

3 REPLIES 3
novinosrin
Tourmaline | Level 20

Good question. Proc sql joins work by joining based on values i.e m*n product for each by group however merge does by position of by groups record by record. Therefore in= options kinda thing wouldn't be relevant from a stand point of PDV vs product cartesian(for each by group)

 

Of course saying that, you could create a flag to keep track I suppose

 

kiranv_
Rhodochrosite | Level 12

 


 data ds1;
 input var var1 $;
 datalines;
 1   x
 2   y
 ;
 
  data ds2;
 input var var2 $;
 datalines;
 2   J
 3   N
 ;
 
 
 data dsout;

    merge   ds1 (in=a)

            ds2 (in=b);

    by var;
length dataset $10.;
 if a then dataset="ds1";
 if b then dataset ="ds2";
 if a and b then dataset="ds1+ds2";
 
run;


proc sql;
create table dsout_sql1 as 
select coalesce(a.var, b.var) as var,
        var1,
        var2,
       case when a.var =b.var then "ds1+ds2"
        when a.var = . then "ds2"
        when b.var = . then "ds1" end as dataset
 from ds1 a
 full join
 ds2 b
 on a.var = b.var;
        




PGStats
Opal | Level 21

You can approximate the same operation with:

 

data ds1;
 input var var1 $;
 datalines;
 1   x
 2   y
 ;
 
 data ds2;
 input var var2 $;
 datalines;
 2   J
 3   N
 ;

proc sql;
select 
    *, 
    not missing(ds1.var) as inDs1,
    not missing(ds2.var) as inDs2
from
ds1 natural full join ds2
quit;

sort of...

PG
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1552 views
  • 0 likes
  • 4 in conversation