filippo_kow
Obsidian | Level 7

Hi guys,

 

I think I have a problem with pass-through facility to Snowflake.

 

I have a query that runs in about 2 minutes directly in Snowflake (it returns 20 million records). Then I would like to run this query from SAS. I am doing it in the following way:

 

proc sql noerrorstop;
   connect to snow as conn(server='XXX.snowflakecomputing.com' 
      user='my_user' password='my_pass' database='some_database' schema='some_schema' warehouse='some_warehouse');

   create table result as
   select * from connection to conn(
      select * from schema.table
   );

   disconnect from conn;
quit;

But unfortunately this query takes a very long time (about 3 hours).

I was expecting that with pass-through it would take more or less the same time as it does directly in Snowflake (maybe a little longer, because it has to download the data from Snowflake, but it shouldn't be much more time-consuming).

 

Can someone suggest what I am doing wrong?

 

 Thanks in advance!
 Filip

 

 

6 REPLIES
LinusH
Tourmaline | Level 20

One difference is that you are sending the result set back to SAS.

How is your SAS session "located" compared with your SF instance?

To be sure you have the same settings for SF, try to create the result set in SF instead (using an EXECUTE block) and see how that performs.
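A minimal sketch of that EXECUTE approach - the whole CREATE TABLE runs inside Snowflake and no rows come back to SAS, which isolates the query time from the download time. `work_schema.result_copy` is a hypothetical target; it assumes a schema where you have create privileges:

```sas
proc sql noerrorstop;
   connect to snow as conn(server='XXX.snowflakecomputing.com' 
      user='my_user' password='my_pass' database='some_database' schema='some_schema' warehouse='some_warehouse');

   /* Runs entirely in Snowflake - nothing is transferred to SAS */
   execute(
      create table work_schema.result_copy as
      select * from schema.table
   ) by conn;

   disconnect from conn;
quit;
```

If this finishes in roughly the 2 minutes you see in Snowflake directly, the extra time in your original step is being spent moving the 20 million rows to SAS, not in the query itself.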

Data never sleeps
filippo_kow
Obsidian | Level 7

 Hi @LinusH ,

 

Thanks for the reply!

 

The problem is that I need to download the data because I need to use them in my process. But I don't think that downloading the data should take this long... Do you have any idea how I can compare the settings of the SAS session versus Snowflake?

 

Also, I am trying to use the EXECUTE statement, but since I don't have the 'create table' privilege, I am trying to create a temporary table like this:

 


proc sql noprint;
   connect to snow as conn(server="&server." 
      user="&user." password="&password." database="&database." schema="&schema." warehouse="&warehouse.");

   execute(create temp table test as select * from schema.table) by conn;

   create table sas_test as select * from connection to conn (SELECT * FROM test);

   disconnect from conn;
quit;

But unfortunately I get an error that the TEST table doesn't exist, so it looks like it is deleted immediately after the EXECUTE statement runs (although I thought this temporary table would exist for the whole session, i.e. until I run "disconnect from conn").

 

Thanks in advance for any help!

 Filip 

 

 

Quentin
Super User

You might try playing with the READBUFF= option. If the issue is slow transfer across the wire, sometimes tuning READBUFF= can help. But you will definitely pay a performance hit for pulling data from Snowflake (or anywhere) into SAS if your SAS environment is not 'close' to the Snowflake environment. You may want to partner with your SAS admin to help test/optimize the connection.
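For example, READBUFF= can go straight on the CONNECT statement. The value below is only an illustrative starting point to experiment with, not a recommendation - the best setting depends on row width and your network:

```sas
proc sql noerrorstop;
   connect to snow as conn(server='XXX.snowflakecomputing.com' 
      user='my_user' password='my_pass' database='some_database' schema='some_schema' warehouse='some_warehouse'
      readbuff=32767);  /* rows fetched per round trip; try several values */

   create table result as
   select * from connection to conn(
      select * from schema.table
   );

   disconnect from conn;
quit;
```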

BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
SASKiwi
PROC Star

Try a simple row count test in both Snowflake and SAS. If the times are similar, that will indicate whether the time to download your result set to SAS is the problem or not. Here is how to do this:

proc sql noerrorstop;
   connect to snow as conn(server='XXX.snowflakecomputing.com' 
      user='my_user' password='my_pass' database='some_database' schema='some_schema' warehouse='some_warehouse');

   select * from connection to conn(
      select count(*) as Row_Count from schema.table
   );

   disconnect from conn;
quit;
LinusH
Tourmaline | Level 20
I thought you could compare the time for table creation between the SF interface and SAS/ACCESS (and not try to reuse the table in this test scenario).
Does the creation time differ?
When it comes to "your process" - maybe some steps can be performed in Snowflake, minimizing the need to download the data?
20 million records doesn't sound like much - how wide is the table?
Maybe you should talk to your network people to figure out a way to make the download faster?
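To illustrate the "perform steps in Snowflake" idea: if the downstream process only needs a subset or an aggregate, push that into the pass-through query so only the reduced result crosses the network. The column names and the filter below are hypothetical placeholders:

```sas
proc sql noerrorstop;
   connect to snow as conn(server='XXX.snowflakecomputing.com' 
      user='my_user' password='my_pass' database='some_database' schema='some_schema' warehouse='some_warehouse');

   /* Only the filtered/aggregated rows are downloaded to SAS */
   create table result as
   select * from connection to conn(
      select some_key, sum(some_value) as total   /* hypothetical columns */
      from schema.table
      where some_date >= '2023-01-01'             /* hypothetical filter  */
      group by some_key
   );

   disconnect from conn;
quit;
```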
Data never sleeps
Sajid01
Meteorite | Level 14

As can be seen from your code, it downloads the entire table (schema.table) from the cloud. From your post it has 20 million records.
This download depends not only on the network speed but also on constraints at the originating server and in your infrastructure.
Based on my past experience, in my opinion this is not a SAS issue, and the relevant teams must be brought on board to resolve it. As you are the impacted party, you may have to lead the effort.
As a test, try downloading the same table through other means - such as scp/sftp/https, whatever is permissible in your environment. Involve the network team and the Snowflake resources.
This test is likely to involve a large, unusual data transfer, so the IT/Network Security or equivalent teams should also be kept in the loop.


Discussion stats
  • 6 replies
  • 2695 views
  • 2 likes
  • 5 in conversation