BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
nemez
Fluorite | Level 6

Hi everyone,

I have two tables A and B and need to join them based on two variables from A which correspond to differently structured variables from B.

data a;
  input id $ var1 $ var2 $;
datalines;
1 1 1
1 3 2
2 2 4
2 4 3 run;
data b;
  input var2 $ var1_1 $ var1_2 $ var1_3 $ var1_4 $;
datalines;
1 a . b2 a .
2 . b1 b3 .
3 . . b4 b2
4 . . . a run;
data want;
  input id $ var1 $ var2 $ type $;
datalines;
1 1 1 a
1 3 2 .
2 2 4 .
2 4 3 b2 ;

Thereby, the dots represent empty cells.

This left join would also be possible with Excel vlookup and two criteria, however Excel's performance is too weak for the actual datasets and I'm sure there is an elegant way to do this also with SAS in one step. Any suggestions of how to do this are appreciated. 

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data a(index=(var2));
  input id $ var1 $ var2 $;
datalines;
1 1 1
1 3 2
2 2 4
2 4 3
;
data b(index=(var2));
  input var2 $ var1_1 $ var1_2 $ var1_3 $ var1_4 $;
datalines;
1 a . b2 a .
2 . b1 b3 .
3 . . b4 b2
4 . . . a
;

data want;
 merge a b;
 by var2;
 want=vvaluex(cats('var1_',var1));
 drop var1_:;
run;

View solution in original post

6 REPLIES 6
Tom
Super User Tom
Super User

Normalize the "lookup" table.  Then it is much easier to deal with. No need to keep the empty cells.  Example using the same character variables.

data lookup
  length row col $8 ;
  input row @ ;
  do col='1','2','3','4';
    input type $ @;
    if not missing(type) then output;
  end;
datalines;
1 a . b2 a .
2 . b1 b3 .
3 . . b4 b2
4 . . . a
;

Now you can use SQL or data step MERGE to combine the two.

Or even load it into a hash object and use the FIND() hash object method to retrieve the TYPE.

 

But it probably makes more sense if you are trying to use VLOOKUP type capabilities if the index variables are numeric.

data lookup
  row+1;
  do col=1 to 4;
    input type $ @;
    if not missing(type) then output;
  end;
datalines;
a . b2 a .
. b1 b3 .
. . b4 b2
. . . a
;

 

Ksharp
Super User
data a(index=(var2));
  input id $ var1 $ var2 $;
datalines;
1 1 1
1 3 2
2 2 4
2 4 3
;
data b(index=(var2));
  input var2 $ var1_1 $ var1_2 $ var1_3 $ var1_4 $;
datalines;
1 a . b2 a .
2 . b1 b3 .
3 . . b4 b2
4 . . . a
;

data want;
 merge a b;
 by var2;
 want=vvaluex(cats('var1_',var1));
 drop var1_:;
run;
nemez
Fluorite | Level 6

Hello Ksharp,

 

thank you very much, your solution works fine.

 

Greetings

s_lassen
Meteorite | Level 14

Assuming the table B is already sorted by VAR2, I would use a solution like this:

proc sort data=a;
  by var2;
run;

data want;
 merge a b;
 by var2;
 array var1_ (*) var1_1-var1_4;
 type=var1_(var1);
 drop var1_:;
run;

You may then want to sort back by ID and VAR1, to get data in the same order as your example.

 

But I do not understand the second observation in your WANT example: should the TYPE variable not be "b3" here (VAR1_3 from the row in B with VAR2=2)?

nemez
Fluorite | Level 6
Thank you for your suggestion and you are right about my mistake in the want example. However, I have not yet tested your approach.
nemez
Fluorite | Level 6

Thank you very much for all the contributions!

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 606 views
  • 1 like
  • 4 in conversation