BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
paritosh176
Fluorite | Level 6

I want to create SAS dataset from DB2 table (27423639 records). Its taking about 1 hr 50 minutes. How can I optimize it? 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Well, if you have to pull all that over the network, a quick calculation resolves to a speed of 10.5 MB/sec, which is about par for a 100 Mbit network, but slow by an order of magnitude for a 1 Gbit network.

So you could either try to improve network throughput, or reduce the size of data that has to cross the network by using SQL pass-through to subset records and/or drop columns that are not needed.

View solution in original post

9 REPLIES 9
Reeza
Super User

Is it a straight extract? Are you pulling it across a network? 

 

27Million records isn't a huge file, probably about 5-8GB? But moving things across a network can take time and that's probably your bottleneck. 

 

How are you currently creating the dataset, a data step, proc sql, proc copy/datasets?

paritosh176
Fluorite | Level 6

Size of db2 table is around 89 GB(observation size 3520 bytes).

I am using proc sql to create dataset from db2 table.

Ksharp
Super User
1)Add option readbuff=100000 into your LIBNAME statement. 2)If you use Pass-Through SQL , that may give you faster .
paritosh176
Fluorite | Level 6

Thanks for your quick response.

I am using pass through and already tried readbuff with different values but no significant improvement in performance.

Kurt_Bremser
Super User

What is the observation size of the SAS dataset?

And how do you retrieve the data (post code with anonymized table names)?

 

I also guess that your network may be the cause, but ~2hrs for ~8 GB would translate to ~1MB/s, which would be very slow, but could be if you do it across a slower WAN connection.

paritosh176
Fluorite | Level 6

Size of db2 table is around 65GB(observation size 3520 bytes).

Want to create SAS dataset from this db2 table.

I have already used readbuff like options but no improvement.

Kurt_Bremser
Super User

Well, if you have to pull all that over the network, a quick calculation resolves to a speed of 10.5 MB/sec, which is about par for a 100 Mbit network, but slow by an order of magnitude for a 1 Gbit network.

So you could either try to improve network throughput, or reduce the size of data that has to cross the network by using SQL pass-through to subset records and/or drop columns that are not needed.

Ksharp
Super User
Another alternative way is using PROC DBLOAD . But it is very old.

The following code doesn't make you faster ?

libname xx db2 dsn=.........    readbuff=100000;
proc copy in=xx out=work;
 select xxxx;
run;

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 2781 views
  • 2 likes
  • 4 in conversation