BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
skcussas
Fluorite | Level 6

Hi everyone,

 

I'm trying to merge two datasets, ds1 and ds2, and create columns to indicate the source of each row in the output dataset, dsout.

For example,

 

data dsout;

    merge   ds1 (in=a)

            ds2 (in=b);

    by var;

    from_ds1 = a;

    from_ds2 = b;

run;

 

I'm wondering if there's a simple way to do that by proc sql

 

Thanks for your help.

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

You can approximate the same operation with:

 

data ds1;
 input var var1 $;
 datalines;
 1   x
 2   y
 ;
 
 data ds2;
 input var var2 $;
 datalines;
 2   J
 3   N
 ;

proc sql;
select 
    *, 
    not missing(ds1.var) as inDs1,
    not missing(ds2.var) as inDs2
from
ds1 natural full join ds2
quit;

sort of...

PG

View solution in original post

3 REPLIES 3
novinosrin
Tourmaline | Level 20

Good question. Proc sql joins work by joining based on values i.e m*n product for each by group however merge does by position of by groups record by record. Therefore in= options kinda thing wouldn't be relevant from a stand point of PDV vs product cartesian(for each by group)

 

Of course saying that, you could create a flag to keep track I suppose

 

kiranv_
Rhodochrosite | Level 12

 


 data ds1;
 input var var1 $;
 datalines;
 1   x
 2   y
 ;
 
  data ds2;
 input var var2 $;
 datalines;
 2   J
 3   N
 ;
 
 
 data dsout;

    merge   ds1 (in=a)

            ds2 (in=b);

    by var;
length dataset $10.;
 if a then dataset="ds1";
 if b then dataset ="ds2";
 if a and b then dataset="ds1+ds2";
 
run;


proc sql;
create table dsout_sql1 as 
select coalesce(a.var, b.var) as var,
        var1,
        var2,
       case when a.var =b.var then "ds1+ds2"
        when a.var = . then "ds2"
        when b.var = . then "ds1" end as dataset
 from ds1 a
 full join
 ds2 b
 on a.var = b.var;
        




PGStats
Opal | Level 21

You can approximate the same operation with:

 

data ds1;
 input var var1 $;
 datalines;
 1   x
 2   y
 ;
 
 data ds2;
 input var var2 $;
 datalines;
 2   J
 3   N
 ;

proc sql;
select 
    *, 
    not missing(ds1.var) as inDs1,
    not missing(ds2.var) as inDs2
from
ds1 natural full join ds2
quit;

sort of...

PG

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 815 views
  • 0 likes
  • 4 in conversation