I would like to merge two dataset using one variable in one data set and multiple variables in the second data set. Here are the examples of the data sets. I need to use ID from data1 and merge on either ID1, ID2, or ID3 from data2.
data data1;
input ID VAR1 $;
datalines;
1 A
1 B
1 C
2 A
2 C
3 A
3 B
;
run;
data data2;
input ID1 ID2 ID3 VAR2;
datalines;
1 . . 22
1 . . 25
4 2 . 40
4 2 . 24
4 2 . 29
5 6 3 12
5 6 3 15
;
run;
It seems like you could just do this:
proc sql;
create table want as
select distinct a.ID, a.VAR1, b.ID1, b.ID2, b.ID3, b.VAR2
from
data1 A
left join
data2 B
on
a.ID=b.ID1
or
a.ID=b.ID2
or
a.ID=b.ID3;
quit;
Hi @trevand, can you show us what you'd like the output dataset(s) to look like?
You want to do this by using MERGE statement of data step or PROC SQL?
data data1(index=(id));
input ID VAR1 $;
datalines;
1 A
1 B
1 C
2 A
2 C
3 A
3 B
;
run;
data data2(index=(id));
input ID1 ID2 ID3 VAR2;
if not missing(id1) then do;id=id1;output;end;
if not missing(id2) then do;id=id2;output;end;
if not missing(id3) then do;id=id3;output;end;
keep id var2;
datalines;
1 . . 22
1 . . 25
4 2 . 40
4 2 . 24
4 2 . 29
5 6 3 12
5 6 3 15
;
run;
data want;
merge data1(in=ina) data2;
by id;
if ina;
run;
@Ksharp thanks! Could you give maybe an example using proc sql?
It seems like you could just do this:
proc sql;
create table want as
select distinct a.ID, a.VAR1, b.ID1, b.ID2, b.ID3, b.VAR2
from
data1 A
left join
data2 B
on
a.ID=b.ID1
or
a.ID=b.ID2
or
a.ID=b.ID3;
quit;
What result would you expect in case when "data2" has:
1 2 . 40
1 2 3 42
in its observations?
Bart
Something like this will produce result for all matches: single, double or even triple:
data data1;
input ID VAR1 $;
datalines;
1 A
1 B
1 C
2 X
2 Y
3 P
3 Q
;
run;
data data2;
input ID1 ID2 ID3 VAR2;
datalines;
1 . . 22
1 . . 25
4 2 . 40
4 2 . 24
4 2 . 29
5 6 3 12
5 6 3 15
1 2 . 40
1 2 3 42
;
run;
data want;
if 1=_N_ then
do;
if 0 then set data1;
declare hash H(dataset:"data1", multidata:"Y");
H.defineKey("id");
H.defineData("var1");
H.defineDone();
drop id;
end;
array xx id1-id3;
set data2;
do over xx;
if xx then
do while(H.do_over(key:xx)=0);
output;
end;
end;
run;
proc print;
run;
Bart
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.