Hello, I need help finding a way to identify if there's a match between 2 columns with a group by criteria. For example
data tempp;
length first last col_a col_b $10;
input first last col_a col_b;
datalines;
jone smith aa bb
jone smith cc bb
jone smith dd cc
jone smith ee aa
peter paul zz dd
peter paul pp dd
peter paul aa cc
peter paul bb aa
;
run;
We see that for jone smith, there's a matching 'aa' in col_a and col_b, same for peter paul. Is there a way to search within a group (here it's by person) and see if there's a match between 2 columns?
Thanks in advance for any help!
There's also a second one for Jone?
data tempp;
length first last col_a col_b $10;
input first last col_a col_b;
datalines;
jone smith aa bb
jone smith cc bb
jone smith dd cc
jone smith ee aa
peter paul zz dd
peter paul pp dd
peter paul aa cc
peter paul bb aa
;
run;
proc sql;
create table want as
select
t1.first,
t1.last,
t1.col_a,
t2.col_b
from
tempp as t1
inner join
tempp as t2
on t1.first = t2.first and
t1.last = t2.last and
t1.col_a = t2.col_b;
quit;
Is that what you wanted?
Wow I didn't even consider inner joining with itself... I can definitely work with this, thank you so much!! 🙂
Alternatively
data tempp;
length first last col_a col_b $10;
input first last col_a col_b;
datalines;
jone smith aa bb
jone smith cc bb
jone smith dd cc
jone smith ee aa
peter paul zz dd
peter paul pp dd
peter paul aa cc
peter paul bb aa
;
run;
data want;
if _N_ = 1 then do;
dcl hash h(dataset : "tempp", multidata : 'Y');
h.definekey('first', 'last', 'col_b');
h.definedone();
end;
set tempp;
by first last;
if h.check(key : first, key : last, key : col_a) = 0;
run;
Result
first last col_a col_b jone smith aa bb jone smith cc bb peter paul aa cc
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.