Need look up solution help without a sort or Proc SQL?

Reply
Contributor
Posts: 72

Need look up solution help without a sort or Proc SQL?

Hello Everybody,

I need look up solution help without a sort/merge or Proc SQL?

I have a table A /* 1 million records*/

For example:/*sample beverage market research data*/

B_id       Client_name        Client_id /*Can have duplicates but shouldn't matter I hope*/

2             carlsberg            A146

7             kronenbourg        b123

5             stella artois         c987

23           tuborg                 g143

45           budweiser            h765

73           Miller                   b876

Table B /*10 million records*/

B_id  Var1 Var2 Var3  Var4 Var5

5      x      y     z       23    25

5      a      b     c       45    24

5      j       k     l        34    65

WANT:

B_Id in Table B should find a match in Table A and get the value of client_name and client_id and these two columns to result.

For example for the above two inputs the result should presumably look like the following:

B_id  Var1 Var2 Var3  Var4 Var5  Client_name   client_id

5      x      y     z       23    25             stella artois    c987

5      a      b     c       45    24            stella  artois    c987         

5      j       k     l        34    65             stella artois    c987

And if a match is not found, I would like to terminate the entire execution of the entire script that follows the look up and write a note saying match not exist or found.

Super User
Posts: 17,840

Re: Need look up solution help without a sort or Proc SQL?

If a single B_ID is not in lookup table then you terminate? I would run that check first using a SQL IN to verify and then do the mapping.

You can use proc format if you don't want a sort/sql join.

Super User
Posts: 10,500

Re: Need look up solution help without a sort or Proc SQL?

What's wrong with using Proc SQL for this?

Super User
Posts: 5,084

Re: Need look up solution help without a sort or Proc SQL?

A good general approach:  Use a DATA step.  Load Table A into a hash table, and for each Table B record attempt to look up a match.

Issues to consider before thinking about programming syntax:

When you say that Table A can contain duplicates, does that mean duplicates for B_id or duplicates for the other variables?  If B_id could have duplicates, you need to spell out what the results should look like (as well as be prepared for more complex programming).  Duplicates for Client-related variables wouldn't make a difference, as long as B_id is unique.

In general, a format vs. a hash table are often viable alternatives.  Since you have two fields to look up, a hash table would be easier.

A DATA step could easily write a message to the log, and then end the program.  However, note that if you are running through EG, this would boot you out of EG entirely.  You could circumvent that, but the programming would be more complex.

So ... what are the conditions of contest?  A batch job?  Duplicates vs. uniqueness of B_id in Table A?  If  non-match is found, should the current DATA step end immediately, or should it complete first and then terminate all subsequent steps?

Contributor
Posts: 72

Re: Need look up solution help without a sort or Proc SQL?

Sorry, I actually meant to say Duplicates for B_ID. Basically, the idea is to just retrieve client name and client id using B_ID as look up key.  Table B has information of one client for the whole file. For example, B_ID with value 5 can represent one brewery ID that caters the northern California region. So, Table B will not have any other id in a single file./*the result is still the same as I have written in the WANT sample*/

Reeza's idea of using SQL filter to check for presence of B_ID in look table will certainly help, so that to perform look up or not can be decided in advance.

The reason reason Table A may have duplicates is because it may contain other dimensions(variable information regarding sales and output) that I am not interested for my task.

Frequent Contributor
Posts: 115

Re: Need look up solution help without a sort or Proc SQL?

Would this help?/* Just following up 's suggestion

data in;
input B_ID $ Client_name & $20. Client_Id $;
datalines;
2             carlsberg            A146
7             kronenbourg        b123
5             stella artois         c987
23           tuborg                 g143
45           budweiser            h765
73           Miller                   b876
;
data in2;
input (B_id  Var1 Var2 Var3  Var4 Var5) ($);
datalines;
5      x      y     z       23    25
5      a      b     c       45    24
5      j       k     l        34    65
;


data results;
if 0 then set in;
if _N_ = 1 then do;
declare hash h(dataset:'in');
h.defineKey('b_id');
h.defineData('client_name', 'client_id');
h.defineDone();
end;
set in2;
if h.find() = 0 then
output;
run;

Naveen Srinivasan

Contributor
Posts: 72

Re: Need look up solution help without a sort or Proc SQL?

Thanks Naveen, that worked Smiley Happy

Ask a Question
Discussion stats
  • 6 replies
  • 271 views
  • 1 like
  • 5 in conversation