BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
bnarang
Calcite | Level 5

Hi All

If we have two set of data let's say A and B. Both of them contains the list of ID. I would like to select list of accounts in A which are in B

To do so, I can write sql query like this:

Proc sql;

Create Table temp as

   Select acct from

     A

     where acct in ( Select acc from B);

quit;

The problem is now, I do have an existing sas script which I need to modify  where large number of nested conditions are specified in datasteps. How can I apply this condition in same Datastep.

1 possibility is to have all the list of accounts in B as in macro variable and can apply where acct in (&acc_list) but this will not suffice in my case since the number of accounts are in millions and macro variable value has some limitations.

Any suggestions.

1 ACCEPTED SOLUTION

Accepted Solutions
LinusH
Tourmaline | Level 20

Since I don't have access to your SAS script, I'm not sure if I will give you the best advice.

If you want to mimic left joins in the data step (assuming tha ID in table B is unique), you can use MERGE BY-logic, combined with data set option IN=, like

data c;

     merge a(in=a)

               b(in=b);

     by acc;

     if a and b;

     ....

run;

Another option could by to import the ID:s from b into a hash table. Probably better performance, but requires some more programming.

A third option is to build a format on ID on b, with a specific label on "other", the us that in a where clause:

where put(acc,acc_fmt.) not = 'ACC does not exist';

In this case assumed ACC is numeric.

Data never sleeps

View solution in original post

4 REPLIES 4
LinusH
Tourmaline | Level 20

Since I don't have access to your SAS script, I'm not sure if I will give you the best advice.

If you want to mimic left joins in the data step (assuming tha ID in table B is unique), you can use MERGE BY-logic, combined with data set option IN=, like

data c;

     merge a(in=a)

               b(in=b);

     by acc;

     if a and b;

     ....

run;

Another option could by to import the ID:s from b into a hash table. Probably better performance, but requires some more programming.

A third option is to build a format on ID on b, with a specific label on "other", the us that in a where clause:

where put(acc,acc_fmt.) not = 'ACC does not exist';

In this case assumed ACC is numeric.

Data never sleeps
bnarang
Calcite | Level 5

You' re right. But so far I am not aware of Hash merging. The problem is with the existing code is already there and I would just have to enhance it without changing the structure really. So, I was looking for options basically.

NagendraKumarK
Calcite | Level 5

Hi bnarang,

I am not sure, what is u r request. If your request is select list of accounts from A and which are in B table. Please check this below..

proc sql;

create table res from

(select A.acct from A,B where A.acct=B.acct);

run;

Florent
Quartz | Level 8

Hi Bnarang,

What kind of nested conditions are specified in these datasteps that you want to include in your proc sql ?

I think you might simply add them under the 'where' or 'having' clause, depending whether their scope is simple or aggregated data.

Proc sql;

   Create Table temp as

   Select acct

    from A

    where acct in (Select acc from B where <condition on table B>)

        and <condition on table A>;

Quit;

Kind regards,

Florent

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1828 views
  • 0 likes
  • 4 in conversation