SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Slow performance of ODBC connection to SNOWFLAKE

Reply
Occasional Contributor
Posts: 5

Slow performance of ODBC connection to SNOWFLAKE

We are setting up a new SAS FAW environment that is connecting to Snowflake (ODBC) and S3 as our data sources.

 

Opening a Snowflake table in SAS Enterprise Guide 7.15 takes a really long time (5-16 hours) for medium sized tables, Character variable length in Snowflake seems to be one of the reasons, being this: VARCHAR(16777216) the default length for character variables in Snowflake

We have tried creating a SAS view, which solves the speed problem, but it demands a manually intensive process to determine the right length of each character variable and it only works after the Snowflake table has been completely scanned which could take a long time, or simply does not finish.

Are there any other approaches or configuration changes to the ODBC parameters that could help? 

Is somebody else facing the same issues when connecting to Snowflake using ODBC?

 

 

 

 

Super User
Posts: 5,849

Re: Slow performance of ODBC connection to SNOWFLAKE

The post is created now @JBailey.
Data never sleeps
SAS Employee
Posts: 271

Re: Slow performance of ODBC connection to SNOWFLAKE

Thanks @LinusH

SAS Employee
Posts: 271

Re: Slow performance of ODBC connection to SNOWFLAKE

[ Edited ]

Hi @curaloco,

 

TL;DR - best practice is to set the lengths when creating variable length character columns.

 

The problem you are experiencing is likely caused by not putting a length on VARCHAR column definition. The EG issue is separate but exacerbated by the text column lengths expanding. It is not a good idea to open an entire table in an EG viewer because it is going to read the entire thing into SAS. At a minimum it is best to limit the number of rows displayed in EG (Tools --> Options --> Query --> Number of rows to process in preview results window).

 

It is important to understand that these issues are not unique to Snowflake it can happen with many DBMSs.

 

Here is an example showing the character length issue:

 

libname snowdmax odbc dsn='snowflake_DSN' schema=PUBLIC  
                      user=myuser pw=passwd123
                      dbcommit=0 autocommit=no 
                      readbuff=100 insertbuff=100
                      dbmax_text=10;

/* This code can be used to spit out the ODBC driver options */ libname prompt odbc prompt=yes; %put %superq(sysdbmsg); /* notice I am not setting the lengths here */ data snow.cars_dbtype (dbtype=(make='string', model='varchar', type='varchar',origin='text')); set sashelp.cars; run;

Let's see what the columns look like.

 

columns_32k.jpg

 

Notice: 32K columns. This is not good. 

 

Let's create a SAS data set from the Snowflake table and see if DBMAX_TEXT= helps.

 

data work.snow_cars;
   set snow.cars_dbtype;
run;

/* Lets compare it to one with length information */

data snow.good_cars;
set sashelp.cars;
run;

data work.good_cars;
set snow.good_cars;
run;

 

Here is the result... Notice the size difference... DBMAX_TEXT= didn't help. It only works on BLOBs. I think this is a problem and am going to try to do something about it. Some ODBC drivers will let you limit the length of text strings. Unfortunately, the Snowflake ODBC driver doesn't appear to be one of these. I will ask them about it.

 

Back to the size...

 

snow_32k.jpg

It's a huge difference. This problem will cause performance issues when reading and writing. It also causes one of the sneakiest SAS issues I have ever seen. It happened with Hadoop but this is the same issue.

 

A call came into SAS Tech Support concerned about SASWORK running out of space. It took a while to discover that expanding character column lengths was causing the problem. Initially, the customer added space to SASWORK. It didn't address the problem although it did postpone it for a while.

 

So, the best practice is to set the lengths when creating variable length character columns. 

Ask a Question
Discussion stats
  • 3 replies
  • 299 views
  • 2 likes
  • 3 in conversation