Sheeba
Lapis Lazuli | Level 10

Hi,

 

I am trying to extract data from Snowflake into SAS (I am not using SAS Viya). What options can I use for efficient data extraction?

I tried extracting the same table (same number of observations and the same variables) from Oracle and from Snowflake into SAS.

The table created from Oracle took up 4.7 GB and completed in approximately 4 minutes.

The table created from Snowflake took up 1,221 GB and completed in approximately 3 hours 20 minutes.

Can I use the temporary table support in SAS/ACCESS to overcome this issue?

 

 

Appreciate any help on this.

Thanks,

Tom
Super User

What issue?

You were able to download over 250 times as much data in only 50 times as much time.

Sheeba
Lapis Lazuli | Level 10

Thanks for the reply.

 

Issue here is the size of the data.

I extracted the same data (a billing table) from Oracle and from Snowflake into SAS. The data pulled from Snowflake is so much larger that it is causing a workspace issue.

 

Could it be extra spaces padded onto the columns?

Would it be a good idea to create a predefined layout in SAS and then load it from Snowflake? Or are there any configuration options that can be changed to resolve this issue?

 

 

Tom
Super User

Most likely the table(s) in SNOWFLAKE define the character variables differently. Compare the structures of the two SAS datasets, for example by running PROC CONTENTS.
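A minimal sketch of that comparison (the dataset names here are placeholders for your own extracted tables):

```sas
/* Compare the variable definitions (especially character lengths)
   of the two extracts. WORK.FROM_ORACLE and WORK.FROM_SNOW are
   placeholder names for your Oracle- and Snowflake-sourced tables. */
proc contents data=work.from_oracle varnum;
run;

proc contents data=work.from_snow varnum;
run;
```

The Length column in the two listings should show immediately whether the Snowflake extract is padding character variables out to a much larger defined length.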

 

Instead of defining them with an actual maximum length, they probably defined them with some type of undefined-length variable type. It looks like the default in SNOWFLAKE is to limit them to 16,777,216 bytes, which is about 500 times the maximum of 32,767 that can be used for a SAS character variable.

 

You might have to fix the tables in SNOWFLAKE.  Or make a view that redefines the variable types.

 

But you might be able to use the DBSASTYPE dataset option to tell SAS how to define the variables when it pulls from SNOWFLAKE. However, I am not sure whether this changes the length of the strings that are actually transferred, or just affects what is saved to disk when the SAS dataset is written.
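A sketch of the DBSASTYPE dataset option, assuming a libref SNOW has already been assigned with SAS/ACCESS to Snowflake; the table and column names (BILLING, BILL_DESC) and the length 100 are placeholders for your own situation:

```sas
/* Override the SAS length for a wide Snowflake string column.
   BILL_DESC is assumed to hold values of at most 100 characters;
   without the override it would be created at the Snowflake
   default width. */
data work.billing;
  set snow.billing (dbsastype=(bill_desc='CHAR(100)'));
run;
```

You can list several columns inside one DBSASTYPE= specification, separated by spaces.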

 

Sheeba
Lapis Lazuli | Level 10

Sure will do proc contents and will try out DBSASTYPE option.

 

Thanks a lot for the suggestion.

AhmedAl_Attar
Rhodochrosite | Level 12

Hi @Sheeba 

Following what @Tom has said about Snowflake's default string column length (16+ million bytes), I would suggest taking your Oracle table description and using it to define a view on top of the Snowflake table, with the CAST/TRY_CAST functions within Snowflake.
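One way to build such a view from SAS is explicit SQL pass-through, so the CAST/TRY_CAST runs inside Snowflake. This is only a sketch: the connection options, schema, table, column names, and target lengths are all placeholders you would take from your Oracle table description.

```sas
/* Create a Snowflake view that narrows the string columns to
   the lengths used in the Oracle version of the table. */
proc sql;
  connect to snowflake (server="myaccount.snowflakecomputing.com"
                        user=&user password=&pwd database=mydb);
  execute (
    create or replace view myschema.billing_v as
      select cast(bill_id as number(10))          as bill_id,
             try_cast(bill_desc as varchar(100))  as bill_desc
      from myschema.billing
  ) by snowflake;
  disconnect from snowflake;
quit;
```

TRY_CAST returns NULL instead of failing when a value does not fit the target type, which is why it is useful for the columns where bad or blank values may appear.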

For null/blank values, you probably will need to use conditional logic.

 

Hope this helps

Sheeba
Lapis Lazuli | Level 10

Sure. Thank you for the reply.

Will compare the Oracle vs Snowflake table.

SASKiwi
PROC Star

Try setting the SAS COMPRESS option:

options compress = yes;

That will compress out the blank space in text variables on disk, and should speed up data extraction.

 

You might also want to try the DBMAX_TEXT option to set an upper limit on the length of your character variables.
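A sketch combining both suggestions, assuming SAS/ACCESS to Snowflake; the libref, connection options, and the 1000-byte cap are placeholders:

```sas
/* COMPRESS=YES shrinks the output dataset on disk;
   DBMAX_TEXT= caps how many bytes SAS reads for long
   Snowflake character columns (instead of the 16 MB default). */
options compress=yes;

libname snow snowflake server="myaccount.snowflakecomputing.com"
        user=&user password=&pwd database=mydb
        dbmax_text=1000;

data work.billing;
  set snow.billing;
run;
```

DBMAX_TEXT can also be supplied as a dataset option on an individual table if you do not want it applied library-wide.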

Sheeba
Lapis Lazuli | Level 10
Thanks for the details.

Sure, I will try these options.

Discussion stats
  • 9 replies
  • 808 views
  • 6 likes
  • 4 in conversation