BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
trevand
Obsidian | Level 7

I would like to have a variable that tells me which observations have merged and which have not. I would like to create this in the merge step. I can do it when I use merge within the data step. I can't however figure out how I would do it with a proc sql merge. Could anyone help me please? Here is the code that creates this variable in the data step.

 

data data_merged;
merge data_1 (in=a) data_2 (in=b);
by merge_id;
data1=a; 
data2=b;
     if data1=1 & data2=1 then merge=3;
else if data1=0 & data2=1 then merge=2;
else if data1=1 & data2=0 then merge=1;
run;
1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Try

proc sql;
create table joined as
  select
    /* do NOT use asterisks, it causes a WARNING */
    /* use an exhaustive list of variables with suitable aliases instead */
    , case
      when not missing(t1.i) and not missing(t2.id) then 3
      when missing(t1.id) and not missing(t2.id) then 2
      when not missing(t1.id) and missing(t2.id) then 1
else 0 end as indicator from data_1 t1 full join data_2 t2 on t1.id = t2.id ; quit;

I specifically make no reference to MERGE, as this is a SQL join, which is different from a DATA step MERGE.

View solution in original post

6 REPLIES 6
ballardw
Super User

You should share your SQL JOIN (not merge, not the same thing) code that you want to apply this to.

 

you can add variable to any data set at join time by using something like

 

From (select *, 1 as in_a from somedataset) as a

 

trevand
Obsidian | Level 7

@ballardwthis is the code I am using for the proc sql merge. How would you add the merge variable?

 

proc sql;
create table merge_data as
select t1.*, t2.*
from data_1 t1
left join data_2 t2 on t1.id=t2.id;
ballardw
Super User
proc sql;
create table merge_data as
select t1.*, t2.*
from (select *, 1 as data1 from data_1 ) t1
left join (select *, 1 as data2 from data_2) t2 on t1.id=t2.id;

However with a LEFT join there the variable Data1 will always be 1.

 

Note that the outer Select, where you have select t1.*, t2.* doesn't need to actually keep data1 and data2 but the variables can be used in a Case clause to set the "merge" status you want.

 


@trevand wrote:

@ballardwthis is the code I am using for the proc sql merge. How would you add the merge variable?

 

proc sql;
create table merge_data as
select t1.*, t2.*
from data_1 t1
left join data_2 t2 on t1.id=t2.id;

 

trevand
Obsidian | Level 7

@ballardwhow would you add the final merge variable to the proc sql code?

 

     if data1=1 & data2=1 then merge=3;
else if data1=0 & data2=1 then merge=2;
else if data1=1 & data2=0 then merge=1;
ballardw
Super User

With a Left or Right join one of those conditions NEVER appears. With a left join you can never have an observations from the "right" data set without a matching "left". Same with a right joint only the order of which can't appear reverses.

 

LEFT join (on some conditios) explicitly says : Return all the observations from the "left" set and the matches that occur in the right set when present.

Kurt_Bremser
Super User

Try

proc sql;
create table joined as
  select
    /* do NOT use asterisks, it causes a WARNING */
    /* use an exhaustive list of variables with suitable aliases instead */
    , case
      when not missing(t1.i) and not missing(t2.id) then 3
      when missing(t1.id) and not missing(t2.id) then 2
      when not missing(t1.id) and missing(t2.id) then 1
else 0 end as indicator from data_1 t1 full join data_2 t2 on t1.id = t2.id ; quit;

I specifically make no reference to MERGE, as this is a SQL join, which is different from a DATA step MERGE.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 700 views
  • 1 like
  • 3 in conversation