BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Lanei
Calcite | Level 5

Hi,

 

We have migration on going from Netezza to Vertica. I'm using SAS DIS and we have converted all the Netezza tables to Vertica tables and most of it works fine but I have two problems which I need to figure out. We are using SAS/ACCESS to Vertica.

 

Some jobs have bigger datasets to upload to database and those had BULKLOAD = YES with Netezza. As I understood, Vertica doesn't support BULKLOAD so I had to remove that. I also changed Table Loader (Load style: Append to Existing) Technique from 'proc append' to 'insert sql' because with append I got errors:

WARNING: Variable XX has different lengths on BASE and DATA files (BASE 2560 DATA 40)

from all the character columns.

 

Problem 1 is that when I had to remove BULKLOAD = YES, execution times increased from few minutes to many hours. Is there any 'bulkload' option which works with Vertica?

 

And problem 2 may be because of problem 1 (?). After job has executed 8 hours, I get error:

ERROR: Error attempting to CREATE a DBMS table. ERROR: CLI execute error: ROLLBACK 4213: Object "table_name"  already exists.

I can't repeat that error with limited datasets so this occurs only after long runs. I also checked that all the work table names match in log so it's not (shouldn't be) because of table references. Table is not empty before this append and it should not try to create anything.

1 ACCEPTED SOLUTION

Accepted Solutions
s_lassen
Meteorite | Level 14

I am not familiar with Vertica, but I can see from the documentation that there is no bulk load facility. One other way to speed things up when writing to a DBMS table is to set the libname or dataset option INSERTBUFF to a high value. You probably want to experiment with the number, according to the documentation the default value depends on the row length, so the optimal value may depend on the table being inserted.

 

What happens with PROC APPEND? Are all the character columns in your SAS input really 2560 characters long? And the length of the corresponding values in Vertica only 40 characters? You should probably check if the conversion from Netezza to Vertica went well, then.

View solution in original post

3 REPLIES 3
s_lassen
Meteorite | Level 14

I am not familiar with Vertica, but I can see from the documentation that there is no bulk load facility. One other way to speed things up when writing to a DBMS table is to set the libname or dataset option INSERTBUFF to a high value. You probably want to experiment with the number, according to the documentation the default value depends on the row length, so the optimal value may depend on the table being inserted.

 

What happens with PROC APPEND? Are all the character columns in your SAS input really 2560 characters long? And the length of the corresponding values in Vertica only 40 characters? You should probably check if the conversion from Netezza to Vertica went well, then.

Lanei
Calcite | Level 5
Hi,
That INSERTBUFF looks very promising! I think one job which was earlier 4hours takes now only less than 20minutes! I have to check few times before I believe this 😄 Thank you so much!

That PROC APPEND problem I don't understand. It has increased also and gives me no
WARNING: Variable XX has different lengths on BASE and DATA files (BASE 32767 DATA 40).
That's the maximum (I tried few times). And all the chars get same BASE, even the DATA varies (BASE 32767 DATA 16) or (BASE 32767 DATA 32).
This job reads SAS table and takes summary and then loads to the Vertica table. Extract before Table Loader changes column length from 20 to 40 (with variable XX). I tried to check the character lengths, but everything seemed ok..?
s_lassen
Meteorite | Level 14

Oh, sorry, I misread your original post. When the BASE variable is LONGER than the DATA variable, there is really no problem with append. You can just use the FORCE option in APPEND, and it should work. You will still get the warnings, though.

 

As I said, I have no experience with Vertica. But if I got similar messages from another DBMS (e.g. SQL Server), I would suspect that character variables had not been optimally created in the conversion - that what used to be standard character variables (e.g. VARCHAR(40)) had been converted to some sort of long text variables in the new database. And these may work in some cases, but they may slow down performance in other cases, and sometimes not work at all for your purposes. Depends on the DBMS. But maybe Vertica has only long text fields?

 

The numbers in the warnings may have changed because the DBMAX_TEXT option for the Vertica libname has been set to a new value.

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1201 views
  • 1 like
  • 2 in conversation