SuryaKiran
Meteorite | Level 14

Hi All,

 

- I did some tests on different Teradata servers, and it takes almost the same time on all of them. When I created a volatile table from Teradata SQL Assistant on the same server, using data that was already on that server, it took only a few seconds for 500,000 rows.

- I tested with 100,000 records with only 1 column (fixed length of 9, no duplicates) and it took 6 minutes in SAS.

- I specified DBCOMMIT=0 (per @Patrick) and MULTISTMT=YES, but from the log it looks like SAS is still inserting the rows one at a time.

 

SAS Log info: 

27  LIBNAME TEST TERADATA SERVER=ABC  USER="XXXXXXXX" PASSWORD="&PW." DBCOMMIT=0 CONNECTION=GLOBAL DBMSTEMP=YES ;
NOTE: Libref TEST was successfully assigned as follows: 
      Engine:        TERADATA 
      Physical Name: ABC
28         PROC DELETE DATA=test.temp1;
29         RUN;

TERADATA_19: Prepared: on connection 1
SELECT * FROM "temp1"
 
TERADATA: trforc: COMMIT WORK 
WARNING: File TEST.temp1.DATA does not exist.
NOTE: PROCEDURE DELETE used (Total process time):
      real time           0.09 seconds
      user cpu time       0.01 seconds
      system cpu time     0.00 seconds
      memory              119.50k
      OS Memory           36776.00k
      Timestamp           06/07/2017 10:06:21 PM
      Step Count                        29  Switch Count  38
      Page Faults                       0
      Page Reclaims                     14
      Page Swaps                        0
      Voluntary Context Switches        97
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           0
     
30         /* Creating a volatile table in Teradata from SAS Table*/
31         proc sql;
32           connect to teradata(SERVER=ABC  USER="XXXXXXX"
32       ! PASSWORD="&PW." CONNECTION=GLOBAL);

33          execute
34           (CREATE MULTISET VOLATILE TABLE temp1(Var CHAR(9))
35         	 NO PRIMARY INDEX ON COMMIT PRESERVE ROWS) 	by teradata;
 
TERADATA_20: Executed: on connection 1
CREATE MULTISET VOLATILE TABLE temp1(Var CHAR(9)) NO PRIMARY INDEX ON COMMIT PRESERVE ROWS
 
36           execute ( COMMIT WORK ) by teradata;
 
TERADATA_21: Executed: on connection 1
COMMIT WORK
 
37          quit;
NOTE: PROCEDURE SQL used (Total process time):
      real time           0.29 seconds
      user cpu time       0.01 seconds
      system cpu time     0.00 seconds
      memory              96.56k
      OS Memory           36776.00k
      Timestamp           06/07/2017 10:06:21 PM
      Step Count                        30  Switch Count  60
      Page Faults                       0
      Page Reclaims                     8
      Page Swaps                        0
      Voluntary Context Switches        187
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           8
      

38         
39         options sastrace=',,,d' sastraceloc=saslog;
 
TERADATA_22: Prepared: on connection 1
SELECT * FROM "temp1"
 
TERADATA: trforc: COMMIT WORK 
 
TERADATA_23: Prepared: on connection 1
SELECT * FROM "temp1"
 
40         proc append  base=TEST.temp1 (multistmt=yes ) data=GRIDWORK.sample1 FORCE ;
41         run;

NOTE: Appending GRIDWORK.SAMPLE1 to TEST.temp1.
 
TERADATA_24: Prepared: on connection 1
USING (A0A CHAR (9),A0B CHAR (9),A0C CHAR (9),A0D CHAR (9),A0E CHAR (9),A0F CHAR (9),A0G CHAR (9),A0H CHAR (9),A0I CHAR (9),A0J CHAR (9),A0K CHAR (9),A0L CHAR (9),A0M CHAR (9),A0N CHAR (9),A0O CHAR (9), ... ,C4R CHAR (9),C4S CHAR (9),C4T CHAR (9),C4U CHAR (9))
INSERT INTO "temp1" ("Var")VALUES (:A0A);INSERT INTO "temp1" ("Var")VALUES (:A0B);INSERT INTO "temp1" ("Var")VALUES (:A0C); ... ;INSERT INTO "temp1" ("Var")VALUES (:C4T);INSERT INTO "temp1" ("Var")VALUES (:C4U);
 
NOTE: There were 100000 observations read from the data set GRIDWORK.SAMPLE1.
NOTE: 100000 observations added.
NOTE: The data set TEST.temp1 has . observations and 1 variables.
TERADATA: trforc: COMMIT WORK 
NOTE: PROCEDURE APPEND used (Total process time):
      real time           6:04.39
      user cpu time       0.08 seconds
      system cpu time     0.07 seconds
      memory              1339.15k
      OS Memory           38064.00k
      Timestamp           06/07/2017 10:12:26 PM
      Step Count                        31  Switch Count  1126
      Page Faults                       3
      Page Reclaims                     159
      Page Swaps                        0
      Voluntary Context Switches        5167
      Involuntary Context Switches      2
      Block Input Operations            0
      Block Output Operations           208
      


 

 

Thanks,
Suryakiran
JBailey
Barite | Level 11

Hi @SuryaKiran

 

Your results closely match what I experienced yesterday. I created a 2M-row data set and it took 1:57 (one hour and 57 minutes) to move it into a VOLATILE table. I then backed the size off to 50K rows (2 columns: $10 and $3).

 

The first test I ran was a simple INSERT (no MULTISTMT=YES), and it took slightly over 4 minutes to insert the 50K rows. Next, I specified TPT=NO MULTISTMT=YES on the target table, and the 50K rows were INSERTed in just under 3 minutes. That isn't great, but it is roughly 33% faster than doing nothing. We can't use TPT=YES here because TPT opens its own connection and cannot see the VOLATILE table.
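For reference, here is a minimal sketch of that second test. The options go on the target table as data set options; TEST is the CONNECTION=GLOBAL, DBMSTEMP=YES libref from your log, and the volatile table temp1 is assumed to already exist:

/* multi-statement inserts into an existing volatile table */
proc append base=TEST.temp1 (tpt=no multistmt=yes dbcommit=0)
            data=work.sample1 force;
run;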

 

I then used TPT=YES FASTLOAD=YES (loading a permanent table, since TPT can't see the volatile one) and was able to load the 2M-row table in under 7 seconds. I think it is clear that you really want to use FASTLOAD.
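A sketch of that FASTLOAD run, under the assumptions just stated: TDPERM is a hypothetical libref assigned without DBMSTEMP=YES, and the table and data set names are placeholders. Keep in mind that FastLoad requires the target table to be empty.

/* FastLoad via TPT into an empty, permanent Teradata table */
proc append base=TDPERM.stage2m (tpt=yes fastload=yes)
            data=work.sample2m force;
run;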

 

Given your situation, I would describe the problem to your DBA. They could create an empty table for you and allow you to FASTLOAD into it and delete its contents between runs (see the sketch below). If you take this approach, be sure to check out the link below on DELETEing data fast. Alternatively, they may allow you to create a table in a playpen. Either of these approaches would be greatly preferable to using a VOLATILE table.
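A sketch of that workflow, with hypothetical names (TDPERM is a libref without DBMSTEMP=YES, and stage_keys is the DBA-created permanent table):

/* empty the staging table so FastLoad will accept it */
proc sql;
  connect to teradata (server=ABC user="XXXXXXXX" password="&PW.");
  execute (delete from stage_keys) by teradata;
quit;

/* reload it with FastLoad */
proc append base=TDPERM.stage_keys (tpt=yes fastload=yes)
            data=work.keys force;
run;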

 

When dealing with your DBA, express concern that not being able to create a table will make you use more Teradata resources than required. Point out that if you aren't able to move the small set of data to the server, SAS may unnecessarily read large amounts of data from Teradata into SAS (SAS will perform the join when it is best performed by Teradata).

 

When I was in SAS consulting I dealt with this type of thing quite often. If you explain it as "I am trying to make efficient use of database resources," there is a good chance they will help you.

 

So, ask for permission to create the table in a playpen, and if they say "no," ask them to create the table for you. Either way, it is better than the VOLATILE table strategy.

 

Once you have permission to create your table, you will want to ensure that it is indexed properly and has up-to-date statistics. This will make the join processing as efficient as possible. It may be fine to load the table and then collect statistics once; that can work because the table will look much the same each time it is loaded, since the data is deleted before each new load. I think I discuss statistics in the Troubleshooting SAS and Teradata Query Performance Problems paper.
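Here is a sketch of collecting statistics after a load, via explicit pass-through; the table and column names are hypothetical:

/* refresh optimizer statistics on the reloaded staging table */
proc sql;
  connect to teradata (server=ABC user="XXXXXXXX" password="&PW.");
  execute (collect statistics on stage_keys column (var)) by teradata;
quit;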

 

You may find these helpful:

Effectively Moving SAS Data into Teradata

 

Troubleshooting SAS and Teradata Query Performance Problems

 

Teradata Parallel Transporter: Loading Your SAS Data Just Became Easier

 

The SQL Tuning Checklist: Making Slow Database Queries a Thing of the Past

 

DBIDIRECTEXEC: GOFAST=YES for Database Processing (SAS Communities Article)

 

DELETE Your Teradata Data Fast! (This will help if the DBA creates a table for you)

 

Best wishes,

Jeff

kiranv_
Rhodochrosite | Level 12

That is great insight, @JBailey. So it is the loading from SAS into the volatile table that causes the performance issues.

SuryaKiran
Meteorite | Level 14

Hi @JBailey,

 

These are great resources and very helpful. But what if there are many users trying to join a SAS table (1M rows) with a Teradata table (300M rows)? I don't think the DBAs will provide every user with a table for scratch work. That would be a waste of resources and an extra burden for them to manage, especially in very big organizations with thousands of SAS users. If there were a way to FASTLOAD data into a volatile table, users wouldn't need any extra resources.

 

Even though it took about 2 hours 30 minutes to send my SAS data into a volatile table, do my joins in Teradata, and get the results back into SAS, that is far better than doing the joins in SAS, which took almost 9 hours. I believe that with a FASTLOAD into the volatile table this could be done in a few minutes.
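For reference, the join stage looks roughly like this once the volatile table is loaded: explicit pass-through on the same CONNECTION=GLOBAL session that created temp1, so the volatile table is visible to the query (the big table and join key names are placeholders):

/* join in Teradata, pull only the result back to SAS */
proc sql;
  connect to teradata (server=ABC user="XXXXXXXX" password="&PW." connection=global);
  create table work.result as
  select * from connection to teradata
    ( select t.*
        from big_table t
        join temp1 v
          on t.acct_id = v.var );
quit;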

 

By any chance, is SAS Institute working on improving performance when loading SAS data into a volatile table (a FASTLOAD for volatile tables)?

       

Thanks,

Suryakiran
kiranv_
Rhodochrosite | Level 12

FASTLOAD is a Teradata utility, so I do not think SAS can do anything there.

All database tables are assigned to a group rather than to a person, and it is the responsibility of the group to maintain its allotted perm space. At my previous workplace, all permanent staging tables were dropped immediately after their usage was done; a DROP TABLE statement for the staging table was in almost all jobs, as in the sketch below.
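Such a drop step is just a pass-through execute (the staging table name is a placeholder):

/* drop the staging table at the end of the job */
proc sql;
  connect to teradata (server=ABC user="XXXXXXXX" password="&PW.");
  execute (drop table stage_keys) by teradata;
quit;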
