Hello,
I have two datasets to merge by two variables as in the example.
the first variables is common to both dataset (in the example key0).
the second is called "key" in the second dataset and can be either "key1" or "key2" or both or neither in the first dataset.
I would like to match each record of the second dataset by key0 and the first matchable between key1 and key2.
moreover I would like have a variable "keymach" that tells which of the two variables (key1 and/or key2) is equal to key. .
any hint is appreciated .
moreover the data are very large dbms tables,so any suggestion to have an efficient code is preferred.
thank you very much in advance
data a;
input key0 key1 key2 value_A;
cards;
1 1 1 10
1 2 3 15
1 3 4 20
;
data b;
input key0 key;
cards;
1 1
1 2
1 3
1 4
;
run;
data want;
input key0 key1 key2 value_A key keymatch $2.;
cards;
1 1 1 10 1 11
1 2 3 15 2 10
1 2 3 15 3 01
1 3 4 20 4 01
;
run;
This seems to match your specification:
proc sql;
create table want as select
a.key0,
a.key1,
a.key2,
a.value_A,
b.key,
cats(b.key=a.key1,b.key=a.key2) length=2 as keymatch
from a,b
where a.key0=b.key0
and (a.key1=b.key or
a.key2=b.key)
;
quit;
But it does not quite match the data you present. Why is the second last row in B not matched to the last row in A in your example?
This is just a start as I don't really understand your want dataset:
data a; input key0 key1 key2 value_A; cards; 1 1 1 10 1 2 3 15 1 3 4 20 ; run; data b; input key0 key; cards; 1 1 1 2 1 3 1 4 ; run; proc transpose data=b out=inter prefix=val; by key0; var key; run; data want; merge a inter; by key0; array val{4}; do i=1 to 4; if val{i}=key1 and key=. then key=i; end; run;
So basically transpose the second table up, merge on key0 then use an array to find the data you want.
This seems to match your specification:
proc sql;
create table want as select
a.key0,
a.key1,
a.key2,
a.value_A,
b.key,
cats(b.key=a.key1,b.key=a.key2) length=2 as keymatch
from a,b
where a.key0=b.key0
and (a.key1=b.key or
a.key2=b.key)
;
quit;
But it does not quite match the data you present. Why is the second last row in B not matched to the last row in A in your example?
Thank you.
it should do what I need (there was an error in my data example).
I prefer the sql solution over the transpose + merge since I hope to pass it to the DBMS for execution.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.