BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
LMW5
Calcite | Level 5

I used the following SAS code to create Teradata table, but I couldn’t get the same number of rows (1352000 rows) of Teradata table as SAS dataset (6784263 obs). Could you give me any suggestions? Thanks.

SAS code:

 

libname teraODH odbc noprompt="driver=Teradata Database ODBC Driver 16.20;
DBCname=tdprod1.ccf.org; MechanismName=LDAP;
username=&loginid; password=&mypw; database=xxxxx";

 

PROC SQL;
CREATE TABLE teraODH.LOC_DEV AS
SELECT * FROM LOC_V;
QUIT;

 

1 ACCEPTED SOLUTION

Accepted Solutions
hashman
Ammonite | Level 13

@LMW5:

If you can access Teradata via SAS, you do have a SAS/Access engine. But there are two engines: SAS/Access Interface to ODBC and SAS/Access Interface to Teradata. From what you've shown and said, it looks like you have the former and not the latter. Quoting from the SAS documentation:

 

ODBC is an established protocol that facilitates communication between a DBMS and an application that
complies with the ODBC standard. Earlier, it was explained that the SAS/ACCESS Interface to Teradata
is a SAS engine. In the SAS System, the SAS/ACCESS Interface to ODBC is also implemented as a
SAS engine.


What is the difference between the two engines? One difference is that the SAS/ACCESS Interface to
Teradata engine communicates directly with the Teradata DBMS, calling the Teradata CLIv2 interface. In
contrast, the SAS/ACCESS Interface to ODBC engine communicates indirectly with the Teradata DBMS via
the Teradata ODBC driver. This added layer accounts for some differences in capabilities and performance
between the products.  

image.png

If you want to know more, read here:

 

http://support.sas.com/resources/papers/teradata.pdf

 

Kind regards

Paul D.

 

 

 

 

View solution in original post

13 REPLIES 13
SASKiwi
PROC Star

Please post your SAS log including any notes and errors.

hashman
Ammonite | Level 13

@LMW5 :

Do you perchance have observations marked for deletion in LOC_V? Run proc CONTENTS against it and look at the entry "Deleted Observations". If it doesn't say 0, you may have got your culprit, as the observations marked for deletion are skipped when SAS reads the file.

 

On an unrelated note, it's a good idea to use the FASTLOAD=YES data set option with the Teradata table name you're loading. In your case, it would look like:

PROC SQL;
CREATE TABLE teraODH.LOC_DEV (FASTLOAD=YES) AS
SELECT * FROM LOC_V;
QUIT;

From what I've experienced, it can reduce the loading time by an order of magnitude or more.

 

Kind regards

Paul D. 

LMW5
Calcite | Level 5

I tried to add (FASTLOAD=YES) to my code and ran it, but it gave me error info. Thanks anyway.

hashman
Ammonite | Level 13

@LMW5 

Most likely because you connect to TD via the SAS/Access to ODBC which doesn't support the FASTLOAD capability.

 

LMW5
Calcite | Level 5

Sorry, I am confused. I need or don't need SAS/ACCESS for using 'FASTLOAD' code? We don't have SAS/ACCESS.

hashman
Ammonite | Level 13

@LMW5:

If you can access Teradata via SAS, you do have a SAS/Access engine. But there are two engines: SAS/Access Interface to ODBC and SAS/Access Interface to Teradata. From what you've shown and said, it looks like you have the former and not the latter. Quoting from the SAS documentation:

 

ODBC is an established protocol that facilitates communication between a DBMS and an application that
complies with the ODBC standard. Earlier, it was explained that the SAS/ACCESS Interface to Teradata
is a SAS engine. In the SAS System, the SAS/ACCESS Interface to ODBC is also implemented as a
SAS engine.


What is the difference between the two engines? One difference is that the SAS/ACCESS Interface to
Teradata engine communicates directly with the Teradata DBMS, calling the Teradata CLIv2 interface. In
contrast, the SAS/ACCESS Interface to ODBC engine communicates indirectly with the Teradata DBMS via
the Teradata ODBC driver. This added layer accounts for some differences in capabilities and performance
between the products.  

image.png

If you want to know more, read here:

 

http://support.sas.com/resources/papers/teradata.pdf

 

Kind regards

Paul D.

 

 

 

 

LMW5
Calcite | Level 5

Thank you so much. This is exactly what I want to know.

LMW5
Calcite | Level 5

The following is SAS code that I used to access Teradata with Linux SAS.

 

libname teraODH odbc noprompt="driver=Teradata Database ODBC Driver 16.20;
DBCname=tdprod1.ccf.org; MechanismName=LDAP;
username=&loginid; password=&mypw; database=DL_NEO";

 

Now I use Enterprise SAS with SAS studio, which includes SAS/ACCESS interface to Teradata, and the server also changed, so the code doesn't work. What should I change for it? Thanks 

ChrisNZ
Tourmaline | Level 20

proc append would be faster and give more information in the log than proc sql.

 

JBailey
Barite | Level 11

Hi @LMW5 

 

The table you are creating, LOC_DEV, will be defined as a SET table (CREATE SET TABLE...). SET tables cannot contain duplicate rows. The Teradata default is to create a SET table. Multiset tables allow duplicate rows in a table (CREATE MULTISET TABLE...). Let's see if this is your issue. 

 

You can tell if your data contains duplicate rows by issuing a query similar to:

-- Sample code is included below. You can try this technique on your SAS
-- dataset and see if the number matches what is being INSERTed into Teradata.
-- This returns 10 from my SAS dataset even though there are 20 obs in it.
proc sql; select count(*) from (select distinct * from work.dups); quit;



Other examples of seeing if there is a duplicate problem. Try this sample code in your environment:

-- In this example I have set DBCOMMIT=1 to show the problem with a small amount 
-- of data. PLEASE Don't do this for your large amount of data.
--
libname tera odbc dsn=tera16_DSN user=sasxjb password=mypasswd1 dbcommit=1;
data work.dups; input a b $20.; cards; 1 ONE 1 oneone 2 TWO 2 twotwo 3 THREE 3 threethree 4 FOUR 4 fourfour 5 FIVE 5 fivefive 1 ONE 1 oneone 2 TWO 2 twotwo 3 THREE 3 threethree 4 FOUR 4 fourfour 5 FIVE 5 fivefive run; proc sql; create table tera.dups1 as select * from dups; quit;
-- PROC append produces a cleaner log. proc append base=tera.dups2 data=dups; run; proc sql; select count(*) from tera.dups1; select count(*) from tera.dups2; run;

You will see that only 10 rows are inserted into the table (there are 20 OBS in the SAS data set). I included the DBCOMMIT=1 so that the code will show the behavior.

 

If this is your problem you will see an error message similar to this one:

ERROR: CLI execute error: [Teradata][ODBC Teradata Driver][Teradata Database](-2802)Duplicate row
       error in sasxjb.dups1.
WARNING: File deletion failed for TERA.dups1.DATA.

Back to MULTISET tables - the following code should INSERT all the data into the table.

 

proc append base=tera.dups3 (pre_table_opts='MULTISET') data=dups;
run;

proc sql;
   select count(*) from tera.dups3;
run;

If your data doesn't have duplicates, then there may be a space allocation problem when you create your table. Look for different error messages in the code. 

 

About FASTLOAD=. It is a great idea to use FASTLOAD when the data does not contain duplicates. If the data contains duplicates, MULTILOAD=yes is your best bet. Unfortunately, this example is using SAS/ACCESS Interface to ODBC and TPT loading capabilities aren't available. 

 

Good luck!
Jeff

 

 

LMW5
Calcite | Level 5

My dataset doesn't have duplicates, but it is huge, say, 600 million rows. When I ran my code, I got different rows on Teradata table each time. when I created Teradata table with small data, I got correct rows. Is there limit setting? thanks

Patrick
Opal | Level 21

@LMW5 

I find it a bit hard to digest that you're just loosing rows without any warning. Questions

- What does the SAS log tell you about rows in source and rows loaded to target?

- Are there any warnings or errors in the log?

- How do you determine the row counts for source and before and after insert in target? Are you issues select count(*) against the tables or are you doing something else?

 

Also: Use libname option DBCOMMIT=0 to ideally only have a single commit once all the data is loaded (=getting an "all or nothing" outcome).

https://go.documentation.sas.com/?docsetId=acreldb&docsetTarget=p00lgy3xwh61b8n16kffwq3veagu.htm&doc... 

Tom
Super User Tom
Super User
  • A normal table in Teradata does not allow duplicate rows. 
  • Teradata also normally ignores case for character strings.

If I run this code in SAS there are only 10 distinct rows found. So Teradata is doing what you asked it to do.

select count(*) from 
(select distinct a,upcase(b) as b
 from dups)
;

Either design your table so that it is a multiset table OR the character variables are case sensitive.

Or just add an extra variable, like a row counter, so that the rows are unique. 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 13 replies
  • 10649 views
  • 4 likes
  • 7 in conversation