Thanks for the clarification. I think I understand the issue better.
I think your problem is similar to one I ran into and has to do with whether the query is actually being processed on the database server or on your SAS machine.
When you run the query with the actual key value, it runs quickly because the entire query is passed to the database where it is processed and only the relevant records are passed back to SAS which then puts the results into a table.
When you run the query with a subquery that uses an existing SAS dataset, that query can't be entirely passed to the database because of the reference to the SAS dataset. In this case, the database sends back the entire table B (all 20 million rows) and SAS does the processing of subsetting this table with the SAS dataset created in query A. One way you can see this is by watching the temporary file that is being created while the query is running. You'll probably see it grow to some very large size, and then near the end it will be replaced with a smaller file representing the merge of table B with IPJOIN.
Subqueries work efficiently when they can be passed entirely to the database for processing. Whenever possible, I don't recommend using SAS datasets in subqueries to databases unless the database tables are small. On the other hand, I've found that using macro variables in the query doesn't hurt performance.
Can you try this query? It uses the query you used to build IPJOIN as the subquery for IPBET, but doesn't use any SAS datasets. The side benefit is that your programming will be more efficient: one less query to the database and one less SAS dataset created. I have a feeling this will run faster. Let us know how it works.
[pre]
PROC SQL;
CREATE TABLE IPBET AS
SELECT client_key, ip_addr
FROM tableB
WHERE client_key IN (SELECT DISTINCT client_key
FROM tableA
WHERE acct_no = &acctno);
QUIT;
[/pre]
... View more