Note that in Teradata the PRIMARY INDEX determines how the data is distributed across the compute nodes. You want a primary index that distributes rows evenly, so that each compute node processes approximately the same number of observations. Otherwise you cannot take full advantage of the massively parallel architecture of the Teradata engine.
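To make this concrete, here is a minimal pass-through sketch (the connection options, library, table, and column names are all hypothetical) of creating a Teradata table with an explicitly chosen PRIMARY INDEX:

```sas
/* Hypothetical example: cust_id is assumed to be a high-cardinality,
   evenly distributed column, which makes it a reasonable PRIMARY INDEX
   choice. Skewed columns (e.g. a status flag) would pile rows onto a
   few AMPs. */
proc sql;
   connect to teradata (user=myuser password=mypw server=myserver);
   execute (
      create table mydb.sales_copy as
         (select * from mydb.sales)
      with data
      primary index (cust_id)
   ) by teradata;
   disconnect from teradata;
quit;
```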
Where in the process is it "timing out"?
Is it when running the query in Teradata?
If so then ask your Teradata admin team to help you optimize the query.
Is it when trying to save the query results back to Teradata? Make sure you are not creating a skewed table in Teradata; use an appropriate PRIMARY INDEX.
Is it when trying to download the data to SAS? How much data are you downloading? Do you need that much? Could you instead summarize it in Teradata and return only the summary (i.e. much smaller) data to SAS?
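As a sketch of that last option (connection options and all table/column names are placeholders), explicit pass-through lets Teradata do the aggregation in-database and returns only the summary rows to SAS:

```sas
/* Sketch only: summarize in Teradata, pull back just the aggregate. */
proc sql;
   connect to teradata (user=myuser password=mypw server=myserver);
   create table work.summary as
   select * from connection to teradata (
      select region,
             count(*)    as n_obs,
             sum(amount) as total_amt
      from mydb.sales
      group by region
   );
   disconnect from teradata;
quit;
```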
@SAS_in_Florida wrote:
The data transfer into Teradata completes without error. The timeout happens during the inner join. I will look into finding an appropriate Teradata admin to get their input. Thank you for the suggestion.
@SAS_in_Florida It's certainly a good idea to have a Teradata DBA look into the reason why the inner join takes so long.
One strategy for improving performance that worked for me in the past for such a heterogeneous inner join situation was to generate SQL with a WHERE clause instead of an inner join. For many databases the limit for a single SQL statement is around 32 KB, so you need to generate and send multiple queries.
Here is some sample code to illustrate the approach.
data tera_big;
  val='1';
  do key=1 to 1000000;
    do j=1 to 100;
      output;
    end;
  end;
run;

data sas_small;
  do key=1 to 1000000 by 100;
    output;
  end;
run;
%macro sql_pump(keylist);
  /* Pull the rows for one chunk of keys */
  proc sql;
    create table work.result_set as
      select key, val
      from tera_big
      where key in (%unquote(&keylist))
    ;
  quit;
  /* Append the chunk to the accumulating result table, then drop it */
  proc datasets lib=work nolist;
    append data=work.result_set base=work.want;
    run;
    delete result_set;
    run;
  quit;
%mend sql_pump;
data _null_;
  set sas_small end=last;
  by key;
  if last.key;  /* de-duplicate the key values */
  length keylist $32000;
  retain keylist;
  /* build up a comma-separated list of keys */
  keylist=catx(',', keylist, put(key, f16. -l));
  if _n_=1 then
    do;
      /* start fresh: remove any result table from a previous run */
      call execute('proc datasets lib=work nolist nowarn; delete want; quit;');
    end;
  /* once the list approaches the length limit (or input ends), send a query */
  if last or length(keylist)>31000 then
    do;
      call execute(cats('%sql_pump(%nrstr(', keylist, '));'));
      call missing(keylist);
    end;
run;
For this to potentially outperform the temp table approach, the number of SQL statements you send needs to be relatively small. It should be fine for around 10K keys.
If the key is a character variable, then also wrap the values in single quotes:
keylist=catx(',',keylist,cats("'",key,"'"));
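For comparison, the temp table approach mentioned earlier can be sketched roughly like this (connection options and the DBMS table name are assumptions): upload the small SAS key table to a Teradata temporary table via DBMSTEMP=YES and CONNECTION=GLOBAL, then let Teradata do the join in-database.

```sas
/* Sketch of the temp table alternative; all connection options and
   mydb.tera_big are placeholders. */
libname tdtmp teradata user=myuser password=mypw server=myserver
        connection=global dbmstemp=yes;

/* Upload the small SAS key table as a temporary Teradata table */
data tdtmp.keys;
  set work.sas_small;
run;

/* Join inside Teradata on the shared global connection and
   return only the matching rows to SAS */
proc sql;
  connect to teradata (user=myuser password=mypw server=myserver
                       connection=global);
  create table work.want as
  select * from connection to teradata (
     select b.key, b.val
     from keys a
     inner join mydb.tera_big b
       on a.key = b.key
  );
  disconnect from teradata;
quit;
```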
And I assume this method generates a string too long to submit to the server?
%let age = 14;

data test;
  set sashelp.class end=eof;
  where age = &age.;
  if _n_ = 1 then do;
    call execute('proc sql; select * from sashelp.class where name in (');
  end;
  if not eof then do;
    call execute(quote(name));
    call execute(",");
  end;
  if eof then do;
    call execute(quote(name));
    call execute("); quit;");
  end;
run;