BookmarkSubscribeRSS Feed
Mathis1
Quartz | Level 8

Hi, i would like to merge two tables by two variables A1 and A2, but keep the observations that only have A1 in common in the defenitive table. It would look like this : 

 

Table 1     
A1 A2 Y1
A 1 X
A 2 X
A 3 X
A 4 X
B 1 X
C 1 X
C 2 X
D 1 X
D 2 X

 

Table 2    
A1 A2 Y2
A 1 Y
A 5 Y
B 2 Y
C 1 Y
C 2 Y
D 1 Y
D 3 Y
E 3 Y
E 4 Y
F 2 Y
F 1 Y
G 2 Y
G 3 Y

 

 

The definitive table would looke like it :


Have       
A1 A2 Y1 Y2
A 1 X Y
A 2 X .
A 3 X .
A 4 X .
A 5 . Y
B 1 X  

 

B

2 . Y
C 1 X Y
C 2 X Y
D 1 X  
D 2 X  
D 3 . Y

 

Thank you for your help 😄

 

7 REPLIES 7
PeterClemmensen
Tourmaline | Level 20

Why is Y2 missing in the first obs of the desired output?

Mathis1
Quartz | Level 8

My bad it should not be missing, i just modified it.

PeterClemmensen
Tourmaline | Level 20

Ok. If your actual data is sorted, a simple Merge Statement is enough?

 

I know that your sample data is not (but almost though so I have a feeling your actual data may be)

 

data Table1;   
input A1 $ A2 Y1 $;
datalines;
A 1 X
A 2 X
A 3 X
A 4 X
B 1 X
C 1 X
C 2 X
D 1 X
D 2 X
;

data Table2;   
input A1 $ A2 Y2 $;
datalines;
A 1 Y
A 5 Y
B 2 Y
C 1 Y
C 2 Y
D 1 Y
D 3 Y
E 3 Y
E 4 Y
F 1 Y
F 2 Y
G 2 Y
G 3 Y
;

data want;
   merge Table1 Table2;
   by A1 A2;
run;
yabwon
Onyx | Level 15

Isn't the `in=` operator missing?

data want;
   merge Table1(in = t1) Table2;
   by A1 A2;
   if t1;
run;

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



Kurt_Bremser
Super User

First we need a lookup table that identifies the a1 groups we want to keep:

data l;
merge
  table1 (in=t1 keep=a1)
  table2 (in=t2 keep=a1)
;
by a1;
if t1 and t2;
run;

Next, merge the tables, and select those groups in the lookup:

data want;
merge
  table1
  table2
;
by a1 a2;
if _n_ = 1
then do;
  declare hash l (dataset:"l");
  l.definekey("a1");
  l.definedone();
end;
if l.check() = 0;
run;
hhinohar
Quartz | Level 8
*sql;
proc sql;
create table want1 as
select * from
  (select a.a1,
          a.a2,
          a.y1,
          b.y2 
            from table1 a left join table2 b 
              on a.a1=b.a1 and a.a2=b.a2
  union 
  select b.a1,
         b.a2,
         a.y1,
         b.y2
           from table1 a right join table2 b 
             on a.a1=b.a1 and a.a2=b.a2)
    where a1 in (select distinct a1 from table1);
  ;
quit;

*hash;
data want2;
  length a1 $8 a2 8 y1 y2 $8;
  if _N_=1 then do;
      if 0 then set table2;
      dcl hash h(dataset:"table2",multidata:"y") 
                x(multidata:"y") 
                a(multidata:"y",ordered:"y") 
                u();
      h.definekey("a1","a2");
      h.definedata("a1","a2","y2");
      h.definedone();
      
      x.definekey("a1","a2");
      x.definedata("a1","a2","y1");
      x.definedone();
      
      a.definekey("a1","a2");
      a.definedata("a1","a2","y1","y2");
      a.definedone();
      
      u.definekey("a1");
      u.definedone();

      dcl hiter hi("h") ai("a");		
  end;
  do until(last);
      set table1 end=last;
      x.add();
      _iorc_=u.add();
      _iorc_=h.find();
      if _iorc_ ne 0 then call missing(y2);
      a.replace();
  end;
  if last;
  do while(hi.next()=0);
      _iorc_=x.find();
      if _iorc_ ne 0 then call missing(y1);
      a.replace();
  end;
  if last;
  do while(ai.next()=0);
      if u.find()=0 then output;
  end;
  
  h.clear();
  x.clear();
  a.clear();
  u.clear();
run;
RichardDeVen
Barite | Level 11

Based on strictly on what is stated I infer that the edge case of two groups of rows with a common A1 but completely disjoint in A2 would be in the result set.

 

Proc SQL can use a group COUNT to determine when a join has rows from both left and right tables.  COALESCE must be used ensure the join key is propagated into the result set.

 

Example:

Added A1='Z' to LEFT with no such A1 in RIGHT, showing, that according to 'rules' Z should not be in result.

 

data Table1;   
input A1 $ A2 Y1 $;
datalines;
A 1 X
A 2 X
A 3 X
A 4 X
B 1 X
C 1 X
C 2 X
D 1 X
D 2 X
Z 1 X
Z 2 X
;

data Table2;   
input A1 $ A2 Y2 $;
datalines;
A 1 Y
A 5 Y
B 2 Y
C 1 Y
C 2 Y
D 1 Y
D 3 Y
E 3 Y
E 4 Y
F 1 Y
F 2 Y
G 2 Y
G 3 Y
;

proc sql;
  create table want as
  select 
    coalesce (left.a1, right.a1) as a1
  , coalesce (left.a2, right.a2) as a2
  , left.y1
  , right.y2
  from 
    table1 as left
  full join 
    table2 as right
  on
    left.a1 = right.a1 &
    left.a2 = right.a2
  group by
    calculated a1
  having 
    count (left.a1) > 0 and 
    count (right.a1) > 0
  order by
    a1, a2
  ;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 786 views
  • 0 likes
  • 6 in conversation