Hi,
I am trying to extract data from snowflake into SAS. I am not using SAS Viya. What options can I use for efficient data extraction?
I tried extracting same table(same number of observations and same variables) from oracle and snowflake into SAS.
The table created from Oracle took up 4.7 GB and completed in approximately 4 mins.
The table created from Snowflake took up 1221 GB and completely in approximately 3 Hour 20 mins.
Can I use temporary table support for SAS access to overcome this issue ?
Appreciate any help on this.
Thanks,
What issue?
You were able to download over 250 times as much data in only 50 times as much time.
Thanks for the reply.
Issue here is the size of the data.
I extracted same data (billing table) from oracle and snowflake into SAS. Size of the data seems to be extremely huge when pulling from snowflake that it causing workspace issue.
Could it be extra spaces padded to the columns ?
So is it a good idea to create a predefined layout in SAS and then load it from snowflake ? or any configurations which can be changed to resolve this issue ?
Most likely the table(s) in SNOWFLAKE have defined the character variables differently. Compare the structures of the two SAS datasets. For example by running PROC CONTENTS.
Instead of defining them with an actual maximum length they probably defined them with a some type of undefined length variable type. Looks like the default in SNOWFLAKE is to limit them to 16,777,216 bytes which is about 500 times the maximum of 32,767 that can be used for a SAS variable.
You might have to fix the tables in SNOWFLAKE. Or make a view that redefines the variable types.
But you might be able to use the DBSASTYPE dataset option to tell SAS how to define the variables when it pulls from SNOWFLAKE. However I am not sure if this changes the length of the strings that actually transferred or just impacts what is saved to the disk when the SAS dataset is written.
Sure will do proc contents and will try out DBSASTYPE option.
Thanks a lot for the suggestion.
Hi @Sheeba
Following what @Tom has just told you about Snowflake's default String column length (16+ M), I would suggest taking your Oracle Table description, and use it to define a View on top of Snowflake table with the use of CAST /TRY_CAST functions within Snowflake
For null/blank values, you probably will need to use conditional logic.
Hope this helps
Sure. Thank you for the reply.
Will compare the Oracle vs Snowflake table.
Try setting the SAS COMPRESS option:
options compress = yes;
That will compress all blank space out of text variables on disk, and should speed up data extracting.
You might also want to try the DBTEXTMAX option to set an upper limit on the size of your character variables.
Sorry I mean't DBMAX_TEXT.
Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.
If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.