I need some help figuring out if what I want to accomplish is possible. I pull base data from our system/databases in Snowflake using sql queries. The result tables are saved in Snowflake. Then I pull data (saved in Snowflake tables) into SAS EG by connecting to Snowflake through libnames. Then I analyze data in SAS EG and create a final report. For some of our reports there is a lot of back and forth between Snowflake cloud (Snowflake app, not in SAS EG) and SAS EG.
Is there anyway, I can create a procedure/macro (or something) in snowflake for each sql query that is in Snowflake cloud and call that in SAS EG? This way, I do not have to go back and forth between Snowflake and SAS EG.
Thank you in advance!
You can use SAS code to execute queries in Snowflake. If you already have a LIBREF defined pointing to Snowflake database then that should be all you need.
You can use PROC SQL to run queries. First make the connection by using the CONNECT USING statement. Then you can use the EXECUTE statement to run a command that does not return any output. Or the FROM CONNECTION TO clause for one that does.
So let's assume you defined the libref SNOW pointing to Snowflake
proc sql;
connect using snow;
execute by snow (...some Snowflake command...);
create table want as
select * from connection to snow
(... some Snowflake query ...)
;
quit;
You can use the front end tool Enterprise Guide to create and run the SAS code.
Thank you so much for your reply. I feel like I am almost there but yet! Below is my code.
--------------------- Code -----------------------------
79
76
ERROR 79-322: Expecting a (.
ERROR 76-322: Syntax error, statement will be ignored.
---------------------------------------------------------------------
Thank you
Remove the partial EXECUTE statement, that is what is missing the ().
It does not look like you are trying to execute anything so you don't need that statement.
Note that everything inside the () after the FROM CONNECTION TO xx clause has to be valid Snowflake SQL syntax.
That worked great! Thank you!
Now, I am going to try a bit complicated query. This query filters records using member first name, last name and date of birth under 'where' section in proc sql. In Snowflake names and date of birth are set as below on the top of the query
---------------------------------------------
SET fname = 'ABC';
SET lname = 'DEFG';
SET BMTH= '9999-12-31';
--------------------------------------------------------
Why? Are you trying to do code generation?
You can use SAS's macro language to do code generation.
%let fname = 'ABC';
%let lname = 'DEFG';
%let BMTH= '9999-12-31';
....
proc sql ;
....
where first_name=&fname
and last_name=&lname
and DOB = &BMTH
....
Thank you for all your help! When I tried to run a Snowflake query that pull data from one table, it ran quickly but when I tried q query that pulls data from 5 tables in Snowflake, it kept running. Finally, I stopped it after 5 hours. When I run the same query directly in Snowflake, it runs in under a minute. So, I would definitely be using your method for queries that use one table but not for queries that use multiple tables. I was hoping that it would work for complicated queries as it would save a lot of manually going back and forth between Snowflake and SAS.
Thank you!
@SASMom2 wrote:
Thank you for all your help! When I tried to run a Snowflake query that pull data from one table, it ran quickly but when I tried q query that pulls data from 5 tables in Snowflake, it kept running. Finally, I stopped it after 5 hours. When I run the same query directly in Snowflake, it runs in under a minute. So, I would definitely be using your method for queries that use one table but not for queries that use multiple tables. I was hoping that it would work for complicated queries as it would save a lot of manually going back and forth between Snowflake and SAS.
Thank you!
Without an example I can only guess what you did. But here are some possibilities:
You somehow asked SAS to execute the joins instead of giving the whole query to Snowflake to run. If you ended up doing that then SAS would probably try to pull the whole table over and then join them which could take a very long time. For example do NOT do something like this:
select * from work.mytable a left join snow.dbtable b
on a.id = b.id
;
Instead either upload the list SAS dataset into Snowflake then then perform the join completely in Snowflake.
The sizes of the result sets was vastly different between the two queries and so you are seeing how slow your connection to Snowflake is.
There are options you can turn on to have SAS write the SQL it sends to Snowflake into the SAS log so you can see exactly what SAS ended up asking to be run.
https://documentation.sas.com/doc/en/workbenchcdc/v_001/vwbacdata/n0sa0yux9sqf90n1ttzxhko2cit4.htm
You can then try running that code in Snowflake and time how long it takes.
I did exactly what you said not to do. Can you please clarify below.
"Instead either upload the list SAS dataset into Snowflake then then perform the join completely in Snowflake."
It is from your message.
Thank you
Please provide the query (and log) so we can help analyze it, ideally with using the options in the link from @Tom.
Below is my query. Please note that I have changed the names of fields/tables/schema/db, They are all made up names but the logic and layout is same. When I run it, it does not go past the %include statement. It keeps running.. there is no log after that. Hope it helps. Thank you!
------------------------- query --------------------------------------------------------------------
Dive into keynotes, announcements and breakthroughs on demand.
Explore Now →Check out this tutorial series to learn how to build your own steps in SAS Studio.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.