I have a list of about 20,000 ids (length $20.). There is a SQL Server database table on a server somewhere with about 2 billion records, and I need to get all the records from that table for my list of ids. I work on a 64-bit PC and connect to the database with an OLEDB libname A.

I tried the following code:

    proc sql;
       create table NEW as
       select *
       from id_list i, A.bigTable t
       where i.id = t.id;
    quit;

This took forever to run and failed because my server connection timed out at 8 hours. So I added a row-number variable to id_list (obsnum = _n_;) and modified the query to:

    create table NEW as
    select *
    from id_list (where=(1 le obsnum le 1000)) i, A.bigTable t
    where i.id = t.id;

This took 12 seconds (real time) to run. So I wrote a macro to do this and ran it 20 times, giving it a new thousand-id range each time.

Then I thought, why not try 2,000 ids at a time instead of 1,000? Thirty minutes later that query was still not complete, so I stopped it.

My questions are: What accounts for the huge increase in time when I try to get records for more than 1,000 ids at a time? Is there a better method to go against a very large table to find records?
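For reference, the batching approach described above can be sketched roughly like this. This is a minimal sketch, not the exact macro I used: the macro name, batch datasets, and the final concatenation step are illustrative, and it assumes obsnum has already been added to id_list as shown.

```sas
/* Sketch of the batched lookup (names are illustrative). */
data id_list;
   set id_list;
   obsnum = _n_;   /* row counter used to slice the id list */
run;

%macro get_batches(batchsize=1000, total=20000);
   %local lo hi;
   %do lo = 1 %to &total %by &batchsize;
      %let hi = %eval(&lo + &batchsize - 1);
      proc sql;
         create table batch_&lo as
         select *
         from id_list (where=(&lo le obsnum le &hi)) i,
              A.bigTable t
         where i.id = t.id;
      quit;
   %end;
%mend get_batches;

%get_batches(batchsize=1000, total=20000)

/* Stack the per-batch results into one table */
data NEW;
   set batch_:;
run;
```

With batchsize=1000 each iteration finishes in seconds, but raising it to 2000 is where the runtime blows up, which is the behavior I am asking about.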