Hi everyone,
I have two tables A and B and need to join them based on two variables from A which correspond to differently structured variables from B.
data a; input id $ var1 $ var2 $; datalines; 1 1 1
1 3 2
2 2 4
2 4 3 run;
data b; input var2 $ var1_1 $ var1_2 $ var1_3 $ var1_4 $; datalines; 1 a . b2 a .
2 . b1 b3 .
3 . . b4 b2
4 . . . a run;
data want; input id $ var1 $ var2 $ type $; datalines; 1 1 1 a
1 3 2 .
2 2 4 .
2 4 3 b2 ;
Thereby, the dots represent empty cells.
This left join would also be possible with Excel vlookup and two criteria, however Excel's performance is too weak for the actual datasets and I'm sure there is an elegant way to do this also with SAS in one step. Any suggestions of how to do this are appreciated.
data a(index=(var2));
input id $ var1 $ var2 $;
datalines;
1 1 1
1 3 2
2 2 4
2 4 3
;
data b(index=(var2));
input var2 $ var1_1 $ var1_2 $ var1_3 $ var1_4 $;
datalines;
1 a . b2 a .
2 . b1 b3 .
3 . . b4 b2
4 . . . a
;
data want;
merge a b;
by var2;
want=vvaluex(cats('var1_',var1));
drop var1_:;
run;
Normalize the "lookup" table. Then it is much easier to deal with. No need to keep the empty cells. Example using the same character variables.
data lookup
length row col $8 ;
input row @ ;
do col='1','2','3','4';
input type $ @;
if not missing(type) then output;
end;
datalines;
1 a . b2 a .
2 . b1 b3 .
3 . . b4 b2
4 . . . a
;
Now you can use SQL or data step MERGE to combine the two.
Or even load it into a hash object and use the FIND() hash object method to retrieve the TYPE.
But it probably makes more sense if you are trying to use VLOOKUP type capabilities if the index variables are numeric.
data lookup
row+1;
do col=1 to 4;
input type $ @;
if not missing(type) then output;
end;
datalines;
a . b2 a .
. b1 b3 .
. . b4 b2
. . . a
;
data a(index=(var2));
input id $ var1 $ var2 $;
datalines;
1 1 1
1 3 2
2 2 4
2 4 3
;
data b(index=(var2));
input var2 $ var1_1 $ var1_2 $ var1_3 $ var1_4 $;
datalines;
1 a . b2 a .
2 . b1 b3 .
3 . . b4 b2
4 . . . a
;
data want;
merge a b;
by var2;
want=vvaluex(cats('var1_',var1));
drop var1_:;
run;
Hello Ksharp,
thank you very much, your solution works fine.
Greetings
Assuming the table B is already sorted by VAR2, I would use a solution like this:
proc sort data=a;
by var2;
run;
data want;
merge a b;
by var2;
array var1_ (*) var1_1-var1_4;
type=var1_(var1);
drop var1_:;
run;
You may then want to sort back by ID and VAR1, to get data in the same order as your example.
But I do not understand the second observation in your WANT example: should the TYPE variable not be "b3" here (VAR1_3 from the row in B with VAR2=2)?
Thank you very much for all the contributions!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.