Solved: transfer SAS data to Teradata table

LMW5 · Posted 08-28-2019 01:58 PM

I used the following SAS code to create a Teradata table, but the Teradata table does not end up with the same number of rows as the SAS dataset: it has 1352000 rows, while the SAS dataset has 6784263 obs. Could you give me any suggestions? Thanks.

SAS code:

libname teraODH odbc noprompt="driver=Teradata Database ODBC Driver 16.20;
DBCname=tdprod1.ccf.org; MechanismName=LDAP;
username=&loginid; password=&mypw; database=xxxxx";

PROC SQL;
CREATE TABLE teraODH.LOC_DEV AS
SELECT * FROM LOC_V;
QUIT;

SASKiwi · Posted 08-28-2019 09:02 PM

Please post your SAS log including any notes and errors.

hashman · Posted 08-28-2019 10:32 PM

@LMW5 :

Do you perchance have observations marked for deletion in LOC_V? Run proc CONTENTS against it and look at the entry "Deleted Observations". If it doesn't say 0, you may have got your culprit, as the observations marked for deletion are skipped when SAS reads the file.
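A minimal sketch of that check, assuming LOC_V is a SAS data set in the WORK library (the original code uses a one-level name); if LOC_V is actually a view, the counts in DICTIONARY.TABLES will be missing and this check does not apply:

```sas
/* Sketch: look for observations marked for deletion in LOC_V.       */
/* Assumption: LOC_V lives in WORK; change the libref if it does not. */
proc contents data=work.LOC_V;
run;

/* The same figures from the dictionary tables:                      */
/*   nobs   = physical observations                                  */
/*   delobs = observations marked for deletion                       */
/*   nlobs  = logical observations (what SAS actually reads)         */
proc sql;
   select memname, memtype, nobs, delobs, nlobs
      from dictionary.tables
      where libname = 'WORK' and memname = 'LOC_V';
quit;
```

If delobs is greater than 0, nlobs is the number of rows you should expect to arrive in Teradata.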

On an unrelated note, it's a good idea to use the FASTLOAD=YES data set option with the Teradata table name you're loading. In your case, it would look like:

PROC SQL;
CREATE TABLE teraODH.LOC_DEV (FASTLOAD=YES) AS
SELECT * FROM LOC_V;
QUIT;

From what I've experienced, it can reduce the loading time by an order of magnitude or more.

Kind regards

Paul D.

LMW5 · Posted 09-01-2019 06:51 PM

I added (FASTLOAD=YES) to my code and ran it, but it gave me an error. Thanks anyway.

hashman · Posted 09-01-2019 08:50 PM

@LMW5

Most likely that's because you connect to Teradata via SAS/ACCESS Interface to ODBC, which doesn't support the FASTLOAD capability.

LMW5 · Posted 09-03-2019 04:59 PM

Sorry, I am confused. Do I need SAS/ACCESS to use the FASTLOAD option or not? We don't have SAS/ACCESS.

hashman · Posted 09-03-2019 06:25 PM

@LMW5:

If you can access Teradata via SAS, you do have a SAS/Access engine. But there are two engines: SAS/Access Interface to ODBC and SAS/Access Interface to Teradata. From what you've shown and said, it looks like you have the former and not the latter. Quoting from the SAS documentation:

ODBC is an established protocol that facilitates communication between a DBMS and an application that
complies with the ODBC standard. Earlier, it was explained that the SAS/ACCESS Interface to Teradata
is a SAS engine. In the SAS System, the SAS/ACCESS Interface to ODBC is also implemented as a
SAS engine.

What is the difference between the two engines? One difference is that the SAS/ACCESS Interface to
Teradata engine communicates directly with the Teradata DBMS, calling the Teradata CLIv2 interface. In
contrast, the SAS/ACCESS Interface to ODBC engine communicates indirectly with the Teradata DBMS via
the Teradata ODBC driver. This added layer accounts for some differences in capabilities and performance
between the products.

If you want to know more, read here:

http://support.sas.com/resources/papers/teradata.pdf

Kind regards

Paul D.

LMW5 · Posted 09-04-2019 02:00 PM

Thank you so much. This is exactly what I want to know.

LMW5 · Posted 09-10-2019 12:52 PM

The following is SAS code that I used to access Teradata with Linux SAS.

libname teraODH odbc noprompt="driver=Teradata Database ODBC Driver 16.20;
DBCname=tdprod1.ccf.org; MechanismName=LDAP;
username=&loginid; password=&mypw; database=DL_NEO";

Now I use Enterprise SAS with SAS Studio, which includes SAS/ACCESS Interface to Teradata, and the server has also changed, so the code no longer works. What should I change? Thanks.
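Nobody in the thread answered this, so the following is only a sketch of what the statement might look like with the TERADATA engine in place of ODBC. The server value is a placeholder because the new server name is not given, and the @LDAP suffix on the user name is an assumption about how LDAP authentication is configured at this site; check both with your Teradata administrator.

```sas
/* Sketch only: LIBNAME for SAS/ACCESS Interface to Teradata.           */
/* <new-server> is a placeholder, not a real host name.                 */
/* The ODBC-specific pieces (NOPROMPT=, driver=, DBCname=,              */
/* MechanismName=) are dropped; SERVER=, USER=, PASSWORD= and           */
/* DATABASE= are LIBNAME options of the TERADATA engine.                */
libname teraODH teradata
   server="<new-server>"
   user="&loginid@LDAP"      /* assumption: LDAP logon via the @LDAP suffix */
   password="&mypw"
   database=DL_NEO;
```

With this engine, the FASTLOAD=YES data set option suggested earlier in the thread should no longer be rejected.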

ChrisNZ · Posted 08-28-2019 10:38 PM

proc append would be faster and give more information in the log than proc sql.


JBailey · Posted 08-29-2019 09:52 AM

Hi @LMW5

The table you are creating, LOC_DEV, will be defined as a SET table (CREATE SET TABLE...). SET tables cannot contain duplicate rows. The Teradata default is to create a SET table. Multiset tables allow duplicate rows in a table (CREATE MULTISET TABLE...). Let's see if this is your issue.

You can tell if your data contains duplicate rows by issuing a query similar to:

/* Sample code is included below. You can try this technique on your SAS      */
/* dataset and see if the number matches what is being INSERTed into Teradata. */
/* This returns 10 from my SAS dataset even though there are 20 obs in it.     */
proc sql;
   select count(*)
      from (select distinct * from work.dups);
quit;

Here is another way to see whether there is a duplicate problem. Try this sample code in your environment:

/* In this example I have set DBCOMMIT=1 to show the problem with a small amount */
/* of data. PLEASE don't do this for your large amount of data.                  */
libname tera odbc dsn=tera16_DSN user=sasxjb password=mypasswd1 dbcommit=1;

data work.dups;
   input a b $20.;
cards;
1 ONE
1 oneone
2 TWO
2 twotwo
3 THREE
3 threethree
4 FOUR
4 fourfour
5 FIVE
5 fivefive
1 ONE
1 oneone
2 TWO
2 twotwo
3 THREE
3 threethree
4 FOUR
4 fourfour
5 FIVE
5 fivefive
run;

proc sql;
   create table tera.dups1 as
      select *
         from dups;
quit;

/* PROC APPEND produces a cleaner log. */
proc append base=tera.dups2 data=dups;
run;

proc sql;
   select count(*) from tera.dups1;
   select count(*) from tera.dups2;
quit;

You will see that only 10 rows are inserted into the table (there are 20 OBS in the SAS data set). I included the DBCOMMIT=1 so that the code will show the behavior.

If this is your problem you will see an error message similar to this one:

ERROR: CLI execute error: [Teradata][ODBC Teradata Driver][Teradata Database](-2802)Duplicate row
       error in sasxjb.dups1.
WARNING: File deletion failed for TERA.dups1.DATA.

Back to MULTISET tables - the following code should INSERT all the data into the table.

proc append base=tera.dups3 (pre_table_opts='MULTISET') data=dups;
run;

proc sql;
   select count(*) from tera.dups3;
quit;

If your data doesn't have duplicates, then there may be a space allocation problem when you create your table. Look for different error messages in the log.

About FASTLOAD=. It is a great idea to use FASTLOAD when the data does not contain duplicates. If the data contains duplicates, MULTILOAD=yes is your best bet. Unfortunately, this example is using SAS/ACCESS Interface to ODBC and TPT loading capabilities aren't available.

Good luck!
Jeff

LMW5 · Posted 09-01-2019 07:02 PM

My dataset doesn't have duplicates, but it is huge, say, 600 million rows. Each time I run my code, I get a different number of rows in the Teradata table. When I create the Teradata table from a small dataset, I get the correct number of rows. Is there a limit setting? Thanks.

Patrick · Posted 09-01-2019 11:55 PM

@LMW5

I find it a bit hard to digest that you're just losing rows without any warning. Questions:

- What does the SAS log tell you about rows in source and rows loaded to target?

- Are there any warnings or errors in the log?

- How do you determine the row counts for the source, and for the target before and after the insert? Are you issuing select count(*) against the tables or are you doing something else?

Also: Use libname option DBCOMMIT=0 to ideally only have a single commit once all the data is loaded (=getting an "all or nothing" outcome).

https://go.documentation.sas.com/?docsetId=acreldb&docsetTarget=p00lgy3xwh61b8n16kffwq3veagu.htm&doc...
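A sketch of how these suggestions might look together, reusing the ODBC connection from the original post. It assumes teraODH.LOC_DEV does not exist yet (otherwise PROC APPEND adds rows to the existing table) and uses PROC APPEND, as suggested elsewhere in the thread, for its clearer log:

```sas
/* Same connection string as in the original post, plus DBCOMMIT=0 so   */
/* that a single commit is issued after all rows have been inserted.    */
libname teraODH odbc noprompt="driver=Teradata Database ODBC Driver 16.20;
DBCname=tdprod1.ccf.org; MechanismName=LDAP;
username=&loginid; password=&mypw; database=xxxxx"
   dbcommit=0;

/* Load the data; the log reports how many observations were read      */
/* from the source and added to the target.                            */
proc append base=teraODH.LOC_DEV data=LOC_V;
run;

/* Compare row counts on both sides after the load. */
proc sql;
   select count(*) as source_rows from LOC_V;
   select count(*) as target_rows from teraODH.LOC_DEV;
quit;
```

If the two counts differ, the notes, warnings and errors that PROC APPEND writes to the log are the first place to look.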

Tom · Posted 09-01-2019 10:39 PM

A normal table in Teradata does not allow duplicate rows.
Teradata also normally ignores case for character strings.

If I run this code in SAS there are only 10 distinct rows found. So Teradata is doing what you asked it to do.

proc sql;
   select count(*) from
      (select distinct a, upcase(b) as b
          from dups)
   ;
quit;

Either design your table so that it is a multiset table OR the character variables are case sensitive.

Or just add an extra variable, like a row counter, so that the rows are unique.
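The row-counter idea could be sketched like this, using the work.dups sample data from earlier in the thread. The names dups_keyed, row_id and tera.dups4 are made up for illustration:

```sas
/* Add a sequential counter so that no two rows are identical,         */
/* even when every other column matches.                               */
data work.dups_keyed;
   set work.dups;
   row_id = _n_;   /* _N_ counts DATA step iterations: 1, 2, 3, ...    */
run;

/* All 20 rows are now distinct, so a SET table has no duplicates      */
/* to reject.                                                          */
proc append base=tera.dups4 data=work.dups_keyed;
run;
```

The trade-off is an extra column in the Teradata table; a MULTISET table avoids that if duplicates are acceptable.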
