BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
dhana
Fluorite | Level 6

Happy New Year SAS Communities...!

Here is my problem question

I have ~90 Million record in my master table (Teradata table) and I have another SAS dataset with 1 Million record.
Both of these table contains account numbers.

I want to query the teradata table based on account number using the account numbers from SAS dataset.

I am using the following code to do that

LIBNAME X "/dhana/inputfile";

PROC SQL;

RESET INOBS=MAX OUTOBS=MAX LOOPS=MAX NOFLOW NOFEEDBACK NOPROMPT NONUMBER ;

CONNECT TO TERADATA(TDPID=TERADT USER=%SYSGET(USER) PASSWORD=%SYSGET(PASSWD));

CREATE TABLE X.ALL_ACCTS AS

SELECT * FROM CONNECTION TO TERADATA(

SELECT VAR1, VAR2, VAR3, VAR4, VAR5, VAR5

FROM TABLE1

WHERE ACCT_NBR IN( SELECT ACCT_NBR FROM X.SAS_DATASET)

)
;

QUIT;

It looks like I can't use the SAS dataset in the Teradata SQL pass through query.
I am getting the following error message - Teradata prepare: Database 'X' does not exist.
Is there any way I can subset the teradata table with the account numbers from my SAS dataset.

Thanks

Dhanas

1 ACCEPTED SOLUTION

Accepted Solutions
SASJedi
SAS Super FREQ

To simplify matters, use implicit passthrough with the MULTI_DATASRC_OPT= option on the LIBNAME statement, and the DBMASTER= dataset option on the Teradata table reference during the join. 

The code will look SOMETHING like this:

 
LIBNAME X "C:\MY_SASDATA";
LIBNAME TD teradata  TDPID=TERADT USER="%SYSGET(USER)" 
   PASSWORD="%SYSGET(PASSWD)" multi_datasrc_opt=in_clause;

PROC SQL;
CREATE TABLE X.ALL_ACCTS AS
   SELECT VAR1, VAR2, VAR3, VAR4, VAR5, VAR5
      FROM TD.TABLE1(dbmaster=yes) AS TD
         , X.SAS_DATASET AS SAS
      WHERE TD.ACCT_NBR =SAS.ACCT_NBR 
;
QUIT;
Check out my Jedi SAS Tricks for SAS Users

View solution in original post

5 REPLIES 5
SASJedi
SAS Super FREQ

To simplify matters, use implicit passthrough with the MULTI_DATASRC_OPT= option on the LIBNAME statement, and the DBMASTER= dataset option on the Teradata table reference during the join. 

The code will look SOMETHING like this:

 
LIBNAME X "C:\MY_SASDATA";
LIBNAME TD teradata  TDPID=TERADT USER="%SYSGET(USER)" 
   PASSWORD="%SYSGET(PASSWD)" multi_datasrc_opt=in_clause;

PROC SQL;
CREATE TABLE X.ALL_ACCTS AS
   SELECT VAR1, VAR2, VAR3, VAR4, VAR5, VAR5
      FROM TD.TABLE1(dbmaster=yes) AS TD
         , X.SAS_DATASET AS SAS
      WHERE TD.ACCT_NBR =SAS.ACCT_NBR 
;
QUIT;
Check out my Jedi SAS Tricks for SAS Users
Patrick
Opal | Level 21

I really like SASJedi's approach. The only possible issue I see with it is:

The SAS Doc for MULTI_DATASRC_OPT states "Currently, the IN clause has a limit of 4,500 unique values".

http://support.sas.com/documentation/cdl/en/acreldb/63647/HTML/default/viewer.htm#a002205740.htm

An alternative approach could be to load the SAS dataset into a temporary table in Teradata and then join these 2 tables.

http://support.sas.com/documentation/cdl/en/acreldb/63647/HTML/default/viewer.htm#a002677096.htm

dhana
Fluorite | Level 6

Thanks SASJedi...! 

This is really an excellent option and answer. I tried this option it worked perfectly. But I have faced 2 problems.

1. When I try to merge one teradata table with SAS dataset , I have found an interesting thing. Though the SAS dataset contains approximately 1 Million record and all the account numbers are

     present in the teradata table, the query returned only ~15718 records.

Can you help me why I am getting very less record returned from the query.

2.  When I try to merge 7 teradata tables along with the SAS dataset then I got the following error.

ERROR: Teradata connection: TWM Workload violation: Limit of 7 concurrent sessions, for User XXXXXX, During rule state

       Batch-Normal.

Do you have any idea why I am getting this error or any suggestion.

This is the code I am using ---

LIBNAME _ALL_ CLEAR;

LIBNAME X "/dhana/inputfile/";


LIBNAME TD TERADATA TDPID=TERADT USER=%SYSGET(USER) PASSWORD=%SYSGET(PASSWD) SCHEMA=XXXXXXX MULTI_DATASRC_OPT=IN_CLAUSE;

PROC SQL;

SELECT

A.VAR1,A.VAR2,A.VAR3,B.VAR1,B.VAR2,B.VAR3,B.VAR4,B.VAR5,B.VAR6,B.VAR7,C.VAR1,C.VAR2,C.VAR3,C.VAR4,D.VAR1,D.VAR2,E.VAR1,F.VAR1,G.VAR1

FROM

TD.TABLE1 B

LEFT JOIN


TD.TABLE2 A

ON A.ACCT_ID=B.ACCT_ID AND
A.EFF_DT=B.EFF_DT

LEFT JOIN
TD.TABLE3 C

ON A.ACCT_ID=C.ACCT_ID AND
A.EFF_DT=C.EFF_DT

LEFT JOIN
TD.TABLE4 D

ON A.ACCT_ID=D.ACCT_ID AND
A.EFF_DT=D.EFF_DT

LEFT JOIN
TD.TABLE5 E

ON A.ACCT_ID=E.ACCT_ID AND
A.EFF_DT=E.EFF_DT

LEFT JOIN
TD.TABLE6 F

ON A.ACCT_ID=F.ACCT_ID AND
A.EFF_DT=F.EFF_DT

LEFT JOIN
TD.TABLE7 G

ON A.ACCT_ID=G.ACCT_ID AND
A.EFF_DT=G.EFF_DT

WHERE

B.EFF_DT='2011-11-30'
AND C.VAR6='D'
AND B.ACCT_NBR IN(SELECT ACCT_NBR FROM X.CCLDRIVER)

ORDER BY B.ACCT_NBR, B.ACCT_ID

;

QUIT;


LIBNAME _ALL_ CLEAR;

Thanks

Dhanasekaran R

SASJedi
SAS Super FREQ

1.  Yikes!  As Patrick noted above, there is a limit of 4,500 unique values in the WHERE clause produced by MULTI_DATASRC_OPT.  For the fastest performance, and complete results, you'll need to upload the SAS table (ACCT_NBR column only, of course) to Teradata and do the join there.  Take a look at the BULKLOAD=YES option for the libname statement in the Maximizing Teradata Load Performance article in the online docs - this can make uploading the SAS table many times faster that an conventional row-by-row load.

2. Because of the way the second query is written, SAS must fetch the data from the Teradata tables into SAS to complete the processing.  The LIBNAME engine may be multi-threading and creating individual connections for each table you are reading.  You are reading 7 Teradata tables, the the Teradata error is complaining about 7 concurrent sessions. 

Once again, uploading the ACCT_NBR column from the SAS table to Teradata before you execute this query will probably resolve the issue.  The LIBNAME engine should then be able to push the entire process into Teradata. 

You can get  a better handle on what's actually happening by using the SAS options SASTRACE, SASTRACELOC and NOTSUFFIX.  Try submitting this OPTIONS statement before the SQL:

options sastrace=',,,d' sastraceloc=saslog nostsuffix;

Then, after submitting the SQL code, you can see in the SAS log exactly what SQL was passed to the Teradata instance. 

Check out my Jedi SAS Tricks for SAS Users
dhana
Fluorite | Level 6

Thank You SASJedi...! I will try this option and will come back if I run into any issues...!

Thanks

Dhanasekaran R

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 11882 views
  • 5 likes
  • 3 in conversation