Hello,
I've been running a query on Teradata, and my source table is huge: 3-4 million records. The code has been running for a while now, so I was wondering what I can do to make this faster. I also tried dividing my source dataset into 50 groups (around 75k observations per dataset), but the query for the first group also took forever and I had no option but to terminate it. I would appreciate any suggestions.
libname TD teradata user="%sysget(USER)" password="&pwd." tdpid=rchtera mode=ansi database=user_TD;
proc sql;
   create table TD.table1 as
   select *
   from check; /* check is my base table */
quit;
proc sql;
   connect to teradata (user="%sysget(USER)" password="&pwd." tdpid=rchtera mode=teradata);
   create table new as
   select *
   from connection to teradata
      (select abc.data1,
              cast(abc.data2 as char(19)) as data2,
              abc.data3
       from user_TD.table1 as table
       left join [teradata table] abc on table.data2 = data2
       where abc.data1 not in (select data1 from table)
         and data2 in (select data2 from table));
   disconnect from teradata;
quit;
To test whether a slow network is part of your problem, change your query like so and compare run times.

Replace this:
   CREATE TABLE new AS
   Select *
With this:
   select count(*) as Row_Count
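Applied to the original pass-through query, that test might look something like the sketch below. It reuses the connection settings and the [teradata table] placeholder from the post; I have also renamed the alias table (a reserved word in Teradata) to t and pointed the subqueries at user_TD.table1, which I assume is what was intended:

```
proc sql;
   connect to teradata (user="%sysget(USER)" password="&pwd." tdpid=rchtera mode=teradata);
   select * from connection to teradata
      (select count(*) as Row_Count
       from user_TD.table1 as t
       left join [teradata table] abc on t.data2 = abc.data2
       where abc.data1 not in (select data1 from user_TD.table1)
         and abc.data2 in (select data2 from user_TD.table1));
   disconnect from teradata;
quit;
```

If the count comes back quickly, the bottleneck is moving rows across the network rather than the query itself, and the fix lies in reducing the data transferred, not in rewriting the joins.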
1. in() clauses can be much slower than joins, depending on how the optimiser does its job.
2. I am unsure of the purpose of this:
and data2 in (select data2 from table)
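As an illustration of point 1, the two in() filters could be rewritten as joins, roughly along these lines (a sketch using the column names from the post, with the self-referencing alias table replaced by explicit references to user_TD.table1; note that NOT IN and a left-join/IS NULL anti-join behave differently if data1 can contain nulls):

```
select abc.data1,
       cast(abc.data2 as char(19)) as data2,
       abc.data3
from user_TD.table1 as t
inner join [teradata table] abc
   on t.data2 = abc.data2        /* replaces: data2 in (select data2 ...) */
left join user_TD.table1 as excl
   on abc.data1 = excl.data1     /* anti-join replaces: not in (select data1 ...) */
where excl.data1 is null
```

Whether this is actually faster depends on the optimiser's plan; checking EXPLAIN output for both versions on the Teradata side is the reliable way to compare them.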