Jade_SAS
Pyrite | Level 9

Hi All,

    I have a quick question about writing SAS data back to Teradata.

    1) If I use the TPT (Teradata Parallel Transporter) API to load data.

LIBNAME TD TERADATA SERVER=*** DATABASE=***;

Data TD.TEST(TPT=YES FASTLOAD=YES TPT_MAX_SESSIONS=4);

set work.test123;

Run;

The process time is attached in Time1

 

   2) If I use the normal SAS way to load data

LIBNAME TD TERADATA SERVER=*** DATABASE=***;

Data TD.TEST;

set work.test123;

Run;

The process time is attached in Time2.

 

What puzzles me is why loading the data via 1) takes more time; I thought it should be more efficient. Any comments, please!

 

Thank you!

Jade

1 ACCEPTED SOLUTION

Accepted Solutions
TomKari
Onyx | Level 15

It's important to understand the context. These options are typically called "bulk loader" options, because when you use them SAS will invoke a special loading utility of the DBMS in question.

 

I needed to use this option with a different database product a few years ago. If I didn't use the bulk loader, investigations showed that SAS was copying the data by doing a SQL "Insert" for each row. This approach was horrendously slow.

 

By using the bulk loader option, on the other hand, SAS would create a utility file with the data to be loaded, and then call the database load utility to load the file as a separate process. This worked over 100 times faster than the regular load syntax.

 

However, when using the regular syntax SAS only needs to create and run a series of simple SQL commands, as opposed to all of the overhead of creating a utility file, and then invoking the DBMS loader. So for small tests, like yours, I'm not surprised at all that the times are reversed.

 

For a fair test, try it on a data volume that takes at least 20 minutes to load with the regular load mechanism, and then see if you get an improvement with the bulk loader.
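When running such a comparison, it can also help to see what SAS is actually sending to the DBMS. A minimal sketch using standard SAS/ACCESS tracing and timing options (the exact trace output will depend on your engine and setup):

```sas
/* Write DBMS call details and full timing statistics to the SAS log */
options fullstimer sastrace=',,,d' sastraceloc=saslog nostsuffix;
```

With these set, the log will show whether SAS issues row-by-row inserts or invokes the bulk loader, along with real/CPU times for each step.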

 

Tom


Jade_SAS
Pyrite | Level 9

Thank you for the quick reply, KurtBremser! So does that mean my slow network cannot let the parallel version work efficiently? Is there a way to fix it?

Generally speaking, the parallel version should be faster, right?

 

Jade

 

Kurt_Bremser
Super User

The first step in optimizing is always to get proper knowledge of the problem. So you need to check first (in cooperation with your Teradata admins) where the bottleneck is.

Divide size of the data by time, and see how close that is to the maximum throughput of your network (or simply copy the SAS dataset to a network resource for comparison).
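As a back-of-the-envelope illustration of that calculation (the figures here are purely hypothetical), a quick check in SAS:

```sas
/* Hypothetical figures: a 3 GB table that took 10 minutes to load */
data _null_;
   size_mb = 3 * 1024;        /* table size in MB */
   secs    = 600;             /* observed load time in seconds */
   mbps    = size_mb / secs;  /* effective throughput in MB/s */
   put 'Effective throughput: ' mbps 8.2 ' MB/s';
run;
```

If that works out to, say, 5 MB/s on a network that can sustain 100 MB/s, the network is probably not the bottleneck.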

If network is not the culprit, watch the performance of the server(s). Maxed-out CPUs can't be optimized.

Jade_SAS
Pyrite | Level 9

That makes sense. I did try a big data set and it was faster with TPT.

How can we decide at what data size to use TPT or not? Is there a rough size threshold based on your experience? Thank you!

 

Jade

TomKari
Onyx | Level 15

I don't think there's any harm in just using it all the time. The overhead it adds is minimal, and it's much easier than worrying about putting it in and taking it out.

 

Tom

kiranv_
Rhodochrosite | Level 12

First of all, I think you should have DBCREATE_TABLE_OPTS='primary index (column of index)' in your data set options; otherwise you may get a skewed table in Teradata if your first column is not unique.
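For example, a sketch of the original load step with an explicit primary index (ID_COLUMN is a placeholder; substitute the column that actually distributes your rows evenly):

```sas
/* Bulk load with TPT, specifying the primary index at table creation */
data td.test(tpt=yes fastload=yes tpt_max_sessions=4
             dbcreate_table_opts='primary index (id_column)');
   set work.test123;
run;
```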

I feel there could be many reasons for this: first, the workload at that particular point in time; also, these run times are so small that if you run the same thing multiple times you may get slightly different times.

Previously I tried to benchmark by running the same query again after a few minutes, often getting slightly different results; this can be attributed to the workload at that point in time. If you really want to benchmark, you need to run tables with a lot of data, keeping everything else constant.

One more reason is that FastLoad works by opening multiple sessions; there is a limited number of sessions per utility, which may already be in use by other users, so you have to wait for your turn.

 


