DATA Step, Macro, Functions and more

Subseting Teradata table with SAS dataset

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 75
Accepted Solution

Subseting Teradata table with SAS dataset

Happy New Year SAS Communities...!

Here is my problem question

I have ~90 Million record in my master table (Teradata table) and I have another SAS dataset with 1 Million record.
Both of these table contains account numbers.

I want to query the teradata table based on account number using the account numbers from SAS dataset.

I am using the following code to do that

LIBNAME X "/dhana/inputfile";

PROC SQL;

RESET INOBS=MAX OUTOBS=MAX LOOPS=MAX NOFLOW NOFEEDBACK NOPROMPT NONUMBER ;

CONNECT TO TERADATA(TDPID=TERADT USER=%SYSGET(USER) PASSWORD=%SYSGET(PASSWD));

CREATE TABLE X.ALL_ACCTS AS

SELECT * FROM CONNECTION TO TERADATA(

SELECT VAR1, VAR2, VAR3, VAR4, VAR5, VAR5

FROM TABLE1

WHERE ACCT_NBR IN( SELECT ACCT_NBR FROM X.SAS_DATASET)

)
;

QUIT;

It looks like I can't use the SAS dataset in the Teradata SQL pass through query.
I am getting the following error message - Teradata prepare: Database 'X' does not exist.
Is there any way I can subset the teradata table with the account numbers from my SAS dataset.

Thanks

Dhanas


Accepted Solutions
Solution
‎01-02-2012 04:28 PM
SAS Employee
Posts: 104

Subseting Teradata table with SAS dataset

To simplify matters, use implicit passthrough with the MULTI_DATASRC_OPT= option on the LIBNAME statement, and the DBMASTER= dataset option on the Teradata table reference during the join. 

The code will look SOMETHING like this:

 
LIBNAME X "C:\MY_SASDATA";
LIBNAME TD teradata  TDPID=TERADT USER="%SYSGET(USER)" 
   PASSWORD="%SYSGET(PASSWD)" multi_datasrc_opt=in_clause;

PROC SQL;
CREATE TABLE X.ALL_ACCTS AS
   SELECT VAR1, VAR2, VAR3, VAR4, VAR5, VAR5
      FROM TD.TABLE1(dbmaster=yes) AS TD
         , X.SAS_DATASET AS SAS
      WHERE TD.ACCT_NBR =SAS.ACCT_NBR 
;
QUIT;

View solution in original post


All Replies
Solution
‎01-02-2012 04:28 PM
SAS Employee
Posts: 104

Subseting Teradata table with SAS dataset

To simplify matters, use implicit passthrough with the MULTI_DATASRC_OPT= option on the LIBNAME statement, and the DBMASTER= dataset option on the Teradata table reference during the join. 

The code will look SOMETHING like this:

 
LIBNAME X "C:\MY_SASDATA";
LIBNAME TD teradata  TDPID=TERADT USER="%SYSGET(USER)" 
   PASSWORD="%SYSGET(PASSWD)" multi_datasrc_opt=in_clause;

PROC SQL;
CREATE TABLE X.ALL_ACCTS AS
   SELECT VAR1, VAR2, VAR3, VAR4, VAR5, VAR5
      FROM TD.TABLE1(dbmaster=yes) AS TD
         , X.SAS_DATASET AS SAS
      WHERE TD.ACCT_NBR =SAS.ACCT_NBR 
;
QUIT;
Respected Advisor
Posts: 4,173

Re: Subseting Teradata table with SAS dataset

I really like SASJedi's approach. The only possible issue I see with it is:

The SAS Doc for MULTI_DATASRC_OPT states "Currently, the IN clause has a limit of 4,500 unique values".

http://support.sas.com/documentation/cdl/en/acreldb/63647/HTML/default/viewer.htm#a002205740.htm

An alternative approach could be to load the SAS dataset into a temporary table in Teradata and then join these 2 tables.

http://support.sas.com/documentation/cdl/en/acreldb/63647/HTML/default/viewer.htm#a002677096.htm

Frequent Contributor
Posts: 75

Re: Subseting Teradata table with SAS dataset

Thanks SASJedi...! 

This is really an excellent option and answer. I tried this option it worked perfectly. But I have faced 2 problems.

1. When I try to merge one teradata table with SAS dataset , I have found an interesting thing. Though the SAS dataset contains approximately 1 Million record and all the account numbers are

     present in the teradata table, the query returned only ~15718 records.

Can you help me why I am getting very less record returned from the query.

2.  When I try to merge 7 teradata tables along with the SAS dataset then I got the following error.

ERROR: Teradata connection: TWM Workload violation: Limit of 7 concurrent sessions, for User XXXXXX, During rule state

       Batch-Normal.

Do you have any idea why I am getting this error or any suggestion.

This is the code I am using ---

LIBNAME _ALL_ CLEAR;

LIBNAME X "/dhana/inputfile/";


LIBNAME TD TERADATA TDPID=TERADT USER=%SYSGET(USER) PASSWORD=%SYSGET(PASSWD) SCHEMA=XXXXXXX MULTI_DATASRC_OPT=IN_CLAUSE;

PROC SQL;

SELECT

A.VAR1,A.VAR2,A.VAR3,B.VAR1,B.VAR2,B.VAR3,B.VAR4,B.VAR5,B.VAR6,B.VAR7,C.VAR1,C.VAR2,C.VAR3,C.VAR4,D.VAR1,D.VAR2,E.VAR1,F.VAR1,G.VAR1

FROM

TD.TABLE1 B

LEFT JOIN


TD.TABLE2 A

ON A.ACCT_ID=B.ACCT_ID AND
A.EFF_DT=B.EFF_DT

LEFT JOIN
TD.TABLE3 C

ON A.ACCT_ID=C.ACCT_ID AND
A.EFF_DT=C.EFF_DT

LEFT JOIN
TD.TABLE4 D

ON A.ACCT_ID=D.ACCT_ID AND
A.EFF_DT=D.EFF_DT

LEFT JOIN
TD.TABLE5 E

ON A.ACCT_ID=E.ACCT_ID AND
A.EFF_DT=E.EFF_DT

LEFT JOIN
TD.TABLE6 F

ON A.ACCT_ID=F.ACCT_ID AND
A.EFF_DT=F.EFF_DT

LEFT JOIN
TD.TABLE7 G

ON A.ACCT_ID=G.ACCT_ID AND
A.EFF_DT=G.EFF_DT

WHERE

B.EFF_DT='2011-11-30'
AND C.VAR6='D'
AND B.ACCT_NBR IN(SELECT ACCT_NBR FROM X.CCLDRIVER)

ORDER BY B.ACCT_NBR, B.ACCT_ID

;

QUIT;


LIBNAME _ALL_ CLEAR;

Thanks

Dhanasekaran R

SAS Employee
Posts: 104

Re: Subseting Teradata table with SAS dataset

1.  Yikes!  As Patrick noted above, there is a limit of 4,500 unique values in the WHERE clause produced by MULTI_DATASRC_OPT.  For the fastest performance, and complete results, you'll need to upload the SAS table (ACCT_NBR column only, of course) to Teradata and do the join there.  Take a look at the BULKLOAD=YES option for the libname statement in the Maximizing Teradata Load Performance article in the online docs - this can make uploading the SAS table many times faster that an conventional row-by-row load.

2. Because of the way the second query is written, SAS must fetch the data from the Teradata tables into SAS to complete the processing.  The LIBNAME engine may be multi-threading and creating individual connections for each table you are reading.  You are reading 7 Teradata tables, the the Teradata error is complaining about 7 concurrent sessions. 

Once again, uploading the ACCT_NBR column from the SAS table to Teradata before you execute this query will probably resolve the issue.  The LIBNAME engine should then be able to push the entire process into Teradata. 

You can get  a better handle on what's actually happening by using the SAS options SASTRACE, SASTRACELOC and NOTSUFFIX.  Try submitting this OPTIONS statement before the SQL:

options sastrace=',,,d' sastraceloc=saslog nostsuffix;

Then, after submitting the SQL code, you can see in the SAS log exactly what SQL was passed to the Teradata instance. 

Frequent Contributor
Posts: 75

Re: Subseting Teradata table with SAS dataset

Thank You SASJedi...! I will try this option and will come back if I run into any issues...!

Thanks

Dhanasekaran R

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 4690 views
  • 5 likes
  • 3 in conversation