avinashjha1787
Calcite | Level 5

I am trying to load a Microsoft SQL Server table into CAS with PROC CASUTIL LOAD. The table has more than 100,000 records and two columns of type NVARCHAR(MAX), which hold very large strings. The load takes around 4 hours to complete. My environment has 5 worker nodes of 1TB each.

 

What options do I have to speed up the load?

3 REPLIES
Patrick
Opal | Level 21

You could use numReadNodes=0 to make sure all of your workers take part in a multi-node data transfer, and then look into sliceColumn= and sliceExpressions= to get the data distributed evenly across them.

Patrick
Opal | Level 21

And I forgot to mention readBuff=. You might have to do some testing to figure out the optimal number of rows to fetch in one go.
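
As a rough sketch of how the options mentioned above might be combined in a PROC CASUTIL LOAD, assuming a CAS session is already running and a SQL Server caslib already exists: the caslib name `mssql`, the table name `BIG_TABLE`, the integer column `row_id`, and the `readBuff=` value are all hypothetical placeholders, not values from the original question.

```sas
/* Sketch only: assumes an active CAS session and an existing        */
/* SQL Server caslib. "mssql", "BIG_TABLE" and "row_id" are made up. */
proc casutil incaslib="mssql" outcaslib="casuser";
   load casdata="BIG_TABLE"
        casout="big_table"
        dataSourceOptions=(
           numReadNodes=0        /* 0 asks for all available workers      */
           sliceColumn="row_id"  /* numeric column used to split the read */
           readBuff=500          /* rows per fetch; a starting point only */
        )
        replace;
quit;
```

With two NVARCHAR(MAX) columns each row can be very wide, so it is probably worth timing a few different readBuff= values rather than assuming a larger buffer is faster.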

Mazi
Pyrite | Level 9
Do you need all 100K+ rows?

Though that doesn't seem like much, IMHO.

You might want to look at the table.loadTable action in PROC CAS, as you can add a where clause to it, along with other parameters that are not available in PROC CASUTIL.
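
A minimal sketch of that suggestion, assuming an active CAS session and an existing SQL Server caslib: the caslib name `mssql`, the table name `BIG_TABLE`, and the `load_year` column in the filter are hypothetical and would need to be replaced with real names.

```sas
/* Sketch only: "mssql", "BIG_TABLE" and "load_year" are placeholders. */
proc cas;
   table.loadTable /
      caslib="mssql",
      path="BIG_TABLE",
      where="load_year >= 2023",            /* load only the rows you need */
      dataSourceOptions={numReadNodes=0},   /* connector options also fit here */
      casOut={caslib="casuser", name="big_table", replace=true};
quit;
```

Whether the filter is applied in the database or after the rows reach CAS depends on the data connector, so it is worth checking the load time with and without the where clause.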
