BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
bnarang
Calcite | Level 5

Hi All

If we have two set of data let's say A and B. Both of them contains the list of ID. I would like to select list of accounts in A which are in B

To do so, I can write sql query like this:

Proc sql;

Create Table temp as

   Select acct from

     A

     where acct in ( Select acc from B);

quit;

The problem is now, I do have an existing sas script which I need to modify  where large number of nested conditions are specified in datasteps. How can I apply this condition in same Datastep.

1 possibility is to have all the list of accounts in B as in macro variable and can apply where acct in (&acc_list) but this will not suffice in my case since the number of accounts are in millions and macro variable value has some limitations.

Any suggestions.

1 ACCEPTED SOLUTION

Accepted Solutions
LinusH
Tourmaline | Level 20

Since I don't have access to your SAS script, I'm not sure if I will give you the best advice.

If you want to mimic left joins in the data step (assuming tha ID in table B is unique), you can use MERGE BY-logic, combined with data set option IN=, like

data c;

     merge a(in=a)

               b(in=b);

     by acc;

     if a and b;

     ....

run;

Another option could by to import the ID:s from b into a hash table. Probably better performance, but requires some more programming.

A third option is to build a format on ID on b, with a specific label on "other", the us that in a where clause:

where put(acc,acc_fmt.) not = 'ACC does not exist';

In this case assumed ACC is numeric.

Data never sleeps

View solution in original post

4 REPLIES 4
LinusH
Tourmaline | Level 20

Since I don't have access to your SAS script, I'm not sure if I will give you the best advice.

If you want to mimic left joins in the data step (assuming tha ID in table B is unique), you can use MERGE BY-logic, combined with data set option IN=, like

data c;

     merge a(in=a)

               b(in=b);

     by acc;

     if a and b;

     ....

run;

Another option could by to import the ID:s from b into a hash table. Probably better performance, but requires some more programming.

A third option is to build a format on ID on b, with a specific label on "other", the us that in a where clause:

where put(acc,acc_fmt.) not = 'ACC does not exist';

In this case assumed ACC is numeric.

Data never sleeps
bnarang
Calcite | Level 5

You' re right. But so far I am not aware of Hash merging. The problem is with the existing code is already there and I would just have to enhance it without changing the structure really. So, I was looking for options basically.

NagendraKumarK
Calcite | Level 5

Hi bnarang,

I am not sure, what is u r request. If your request is select list of accounts from A and which are in B table. Please check this below..

proc sql;

create table res from

(select A.acct from A,B where A.acct=B.acct);

run;

Florent
Quartz | Level 8

Hi Bnarang,

What kind of nested conditions are specified in these datasteps that you want to include in your proc sql ?

I think you might simply add them under the 'where' or 'having' clause, depending whether their scope is simple or aggregated data.

Proc sql;

   Create Table temp as

   Select acct

    from A

    where acct in (Select acc from B where <condition on table B>)

        and <condition on table A>;

Quit;

Kind regards,

Florent

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 2512 views
  • 0 likes
  • 4 in conversation