DATA Step, Macro, Functions and more

Simulate Proc sql nested query in Datastep

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 86
Accepted Solution

Simulate Proc sql nested query in Datastep

Hi All

If we have two set of data let's say A and B. Both of them contains the list of ID. I would like to select list of accounts in A which are in B

To do so, I can write sql query like this:

Proc sql;

Create Table temp as

   Select acct from

     A

     where acct in ( Select acc from B);

quit;

The problem is now, I do have an existing sas script which I need to modify  where large number of nested conditions are specified in datasteps. How can I apply this condition in same Datastep.

1 possibility is to have all the list of accounts in B as in macro variable and can apply where acct in (&acc_list) but this will not suffice in my case since the number of accounts are in millions and macro variable value has some limitations.

Any suggestions.


Accepted Solutions
Solution
‎01-11-2013 03:49 AM
Super User
Posts: 5,257

Re: Simulate Proc sql nested query in Datastep

Since I don't have access to your SAS script, I'm not sure if I will give you the best advice.

If you want to mimic left joins in the data step (assuming tha ID in table B is unique), you can use MERGE BY-logic, combined with data set option IN=, like

data c;

     merge a(in=a)

               b(in=b);

     by acc;

     if a and b;

     ....

run;

Another option could by to import the ID:s from b into a hash table. Probably better performance, but requires some more programming.

A third option is to build a format on ID on b, with a specific label on "other", the us that in a where clause:

where put(acc,acc_fmt.) not = 'ACC does not exist';

In this case assumed ACC is numeric.

Data never sleeps

View solution in original post


All Replies
Solution
‎01-11-2013 03:49 AM
Super User
Posts: 5,257

Re: Simulate Proc sql nested query in Datastep

Since I don't have access to your SAS script, I'm not sure if I will give you the best advice.

If you want to mimic left joins in the data step (assuming tha ID in table B is unique), you can use MERGE BY-logic, combined with data set option IN=, like

data c;

     merge a(in=a)

               b(in=b);

     by acc;

     if a and b;

     ....

run;

Another option could by to import the ID:s from b into a hash table. Probably better performance, but requires some more programming.

A third option is to build a format on ID on b, with a specific label on "other", the us that in a where clause:

where put(acc,acc_fmt.) not = 'ACC does not exist';

In this case assumed ACC is numeric.

Data never sleeps
Frequent Contributor
Posts: 86

Re: Simulate Proc sql nested query in Datastep

You' re right. But so far I am not aware of Hash merging. The problem is with the existing code is already there and I would just have to enhance it without changing the structure really. So, I was looking for options basically.

Occasional Contributor
Posts: 14

Re: Simulate Proc sql nested query in Datastep

Hi bnarang,

I am not sure, what is u r request. If your request is select list of accounts from A and which are in B table. Please check this below..

proc sql;

create table res from

(select A.acct from A,B where A.acct=B.acct);

run;

Frequent Contributor
Posts: 127

Re: Simulate Proc sql nested query in Datastep

Hi Bnarang,

What kind of nested conditions are specified in these datasteps that you want to include in your proc sql ?

I think you might simply add them under the 'where' or 'having' clause, depending whether their scope is simple or aggregated data.

Proc sql;

   Create Table temp as

   Select acct

    from A

    where acct in (Select acc from B where <condition on table B>)

        and <condition on table A>;

Quit;

Kind regards,

Florent

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 337 views
  • 0 likes
  • 4 in conversation