Problem statement
Variables created in temporary tables (created by SAS) appear not to be available for use in Snowflake joins. This happens for both numeric and character variables in SAS.
The following code works fine. It creates the temp table (the DBMSTEMP option on the LIBNAME statement sets up the ability to update temp tables; TMP_DATA is the libname that points to the Snowflake schema). But it does not reference the temp table column in the join.
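For reference, TMP_DATA is set up along these lines (a sketch from memory; the engine keyword and the server, database, schema, and credential values below are placeholders rather than our exact setup):

libname tmp_data snow
        server="<<account>>.snowflakecomputing.com"
        database="<<database>>" schema="<<temp file schema>>"
        user="<<user>>" password="<<password>>"
        dbmstemp=yes          /* libref points at Snowflake temporary tables    */
        connection=global ;   /* keep the temp tables visible across SQL steps  */

The test code: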
data work.one ;
  ar_id = 3 ; name = 'Beatles' ; output ;
  ar_id = 2 ; name = 'Marmalade' ; output ;
run ;

options SASTRACE=',,,d' sastraceloc=saslog;

proc append base=tmp_data.DON_TEMP_TABLE_TEST
            data=work.one ;
run ;
title 'TEMP TABLE FROM APPEND' ;

proc sql ;
  connect using TMP_DATA ;

  select * from connection to TMP_DATA
  (
    describe table <<database>>.<<temp file schema>>.DON_TEMP_TABLE_TEST ;
  );

  select * from connection to TMP_DATA
  (
    select * from <<database>>.<<temp file schema>>.DON_TEMP_TABLE_TEST a
         inner join <<database>>.<<table schema>>.ar_translation b
         /* on a.ar_id = b.ar_id */
    limit 2
  );
and produces this output (some fields deliberately masked)
Now uncomment the commented-out line above (so the SQL no longer does a Cartesian join). This Snowflake error occurs:
ERROR: CLI prepare error: SQL compilation error: error line 1 at position 125 invalid identifier 'A.AR_ID'
SQL statement: select * from <<database>>.<<temp file schema>>.DON_TEMP_TABLE_TEST a
inner join <<database>>.<<table schema>>.ar_translation b on a.ar_id = b.ar_id.
So, only when used in a join, the variable is somehow invalid. I've tried random names that don't match the key in B, but no difference. I've also uppercased everything and it still fails. Both numeric and character variables fail.
When I create the temp table directly in Snowflake, this does not happen, so I believe this is a SAS issue.
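For comparison, a minimal sketch of what I mean by creating the temp table on the Snowflake side (instead of the PROC APPEND above), using explicit pass-through so that Snowflake itself parses the unquoted column names; the column types are guesses and the masked names are the same placeholders as above:

proc sql ;
  connect using TMP_DATA ;
  /* unquoted identifiers, so Snowflake stores them as AR_ID and NAME */
  execute (
    create temporary table DON_TEMP_TABLE_TEST (ar_id number, name varchar(20))
  ) by TMP_DATA ;
  execute (
    insert into DON_TEMP_TABLE_TEST values (3, 'Beatles'), (2, 'Marmalade')
  ) by TMP_DATA ;
quit ;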
Has anyone else struck this?
I am raising this with tech support as well.
Note that switching on the SAS trace gives no obviously useful info. Here it is (with a character variable used rather than a numeric one):
62 select * from connection to TMP_DATA
63 (
64 select * from <<database>>.<<temp table schema>>.DON_TEMP_TABLE_TEST a
65 inner join<<database>>.<<table schema>>.ar_translation b
66 on a.ar_local_id = b.ar_local_id
67 limit 2
68
69 );
30 2001018589 rtmdoit 32766 SQL (2)
SNOWFLAKE_8: Prepared: on connection 0 31 2001018589 rtmdoit 32766 SQL (2)
select * from<<database>>.<<temptable schema>>.DON_TEMP_TABLE_TEST a inner join<<database>>.<<table schema>>.ar_translation b on
a.ar_local_id = b.ar_local_id limit 2 32 2001018589 rtmdoit 32766 SQL (2)
33 2001018589 rtmdoit 32766 SQL (2)
ERROR: CLI prepare error: SQL compilation error: error line 1 at position 125 invalid identifier 'A.AR_LOCAL_ID'
SQL statement: select * from <<database>>.<<temp table schema>>.DON_TEMP_TABLE_TEST a inner join
<<database>>.<<table schema>>.ar_translation b on a.ar_local_id = b.ar_local_id limit 2.
NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements.
Thanks, Don
I think the issue is probably that SAS is adding quotes around the variable names when it autogenerates the SQL, and that is causing Snowflake to use the actual case that the quoted name had.
Apparently you have to have the names in all uppercase for Snowflake to recognize unquoted name references.
It might help in your SAS sessions to set the VALIDVARNAME option to UPCASE; that will force SAS to use uppercase names when you make the SAS datasets.
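A minimal sketch of that approach:

options validvarname=upcase ;   /* new variable names are stored in uppercase */

data work.one ;
  ar_id = 3 ; name = 'Beatles' ; output ;   /* stored as AR_ID and NAME */
  ar_id = 2 ; name = 'Marmalade' ; output ;
run ;

proc contents data=work.one short ; run ;   /* should list AR_ID NAME */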
Is SNOWFLAKE one of those databases (like Postgres) that does not handle mixed-case names very well?
What happens if you define the variable name using all UPPERCASE letters?
Or use quotes around the name of the variable when referencing it?
As per the original post, "I've also uppercased everything and it still fails." I did also quote it, with no change.
However, since then, scouring the Snowflake pages, I found that this may well be a casing issue. I haven't quite worked out what they are getting at, but it did include a simple but highly obscure workaround (which our SAS devs have already told me isn't a good fix, which as a SAS dev I know, and I did use "workaround" quite liberally), and it works. Snowflake can access columns by column number, so using a.$1 = b.ar_local_id works. I don't want that as a permanent solution; it's too obscure. I've asked the DBA whether an option referenced in the Snowflake docs can help; it appears not from a SAS session, but I can't really tell if my attempt to set that option via EXECUTE has worked. Anyway, I'll let you know whether the option works.
I think that casing comes into this, I just haven't worked out how, or even whether I can overcome it through how I enter column names in code. Certainly I uppercased everything I can see.
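Spelled out, the workaround looks roughly like this in the full query (a sketch only, with the same masked names as the original post; as noted, not something I want to keep):

proc sql ;
  connect using TMP_DATA ;
  select * from connection to TMP_DATA
  (
    select *
    from <<database>>.<<temp file schema>>.DON_TEMP_TABLE_TEST a
         inner join <<database>>.<<table schema>>.ar_translation b
         /* $1 = first column of A by position, sidestepping the name lookup */
         on a.$1 = b.ar_local_id
    limit 2
  );
quit ;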
After a lot of trial and error I tracked down what needed to be in uppercase. It was the variable name in the SAS DATA step that built the temp table:
data work.one ;
  AR_LOCAL_ID = '03' ; name = 'Beatles' ; output ;
  AR_LOCAL_ID = '02' ; name = 'Marmalade' ; output ;
run ;
Then Snowflake works, and doesn't care whether its queries use the column name in upper or lower case.
I'm somewhat surprised that SAS retained the casing exactly as I typed it in the code, so the column came through as lowercase in Snowflake, while our Snowflake database otherwise has uppercase names, and the two were treated differently.
The "problem" is that both Snowflake and SAS have support for case-sensitive column names. SAS doesn't in general honour that; however, when creating a SAS table in Snowflake, it does. So the variable was created as ar_local_id. But in Snowflake (at our site) it looks like columns are all uppercase. So when SAS pushed in a column in lowercase, Snowflake honoured that, because it can, but then when joining it couldn't find the variable because it wasn't in uppercase. Writing it as uppercase in the query did not help; it has to be uppercase in the SAS DATA step. It is clear when you see the side-by-side DESCRIBE TABLE outputs:
SAS with lower case name in datastep
SAS with upper case name in datastep
If you do not add quotes around the names then Snowflake assumes UPPERCASE letters.
If you add quotes around the names then Snowflake tries to match those exact letters.
So you defined the names using lowercase letters in SAS, then had SAS create a table in Snowflake from the SAS dataset. When SAS generated the SQL code to make the table, it quoted all of the identifiers. So the variables are named ar_id and name instead of AR_ID and NAME, which means that in your explicit Snowflake code you need to use quotes around the names:
on a."ar_id" = b.ar_id
That's awesome, thanks for digging that up. I have, however, used quotes around everything at one point in testing, so I'm a bit confused why that didn't work. Anyway, I'll try that again along with the VALIDVARNAME option. Thanks again, I'll update here.
If you made the other table normally, without adding quotes around the variable names, then they would be in UPPERCASE. So if you tried
on a."ar_id" = b."ar_id"
Then it will find the variable SAS created but not the other one.
So you would have had to use uppercase letters inside the quotes.
on a."ar_id" = b."AR_ID"
SAS stores the variable names in mixed case, but ignores the case when looking for the variable.
SNOWFLAKE also stores the names in mixed case, but it does not ignore the case when looking for the variable. However, if you create the variable without quoting, Snowflake stores the name in uppercase.
When you use a variable name without quoting SNOWFLAKE upcases the name and then checks if the upcase version of the name is there.
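A small sketch of those rules, runnable through the same pass-through connection (the table and column names here are made up purely for illustration):

proc sql ;
  connect using TMP_DATA ;
  /* one column quoted (stored as lowercase "ar_id"), one unquoted (stored as AR_ID2) */
  execute (
    create temporary table CASE_DEMO ("ar_id" number, ar_id2 number)
  ) by TMP_DATA ;

  /* unquoted ar_id2 folds to AR_ID2 and is found;        */
  /* the quoted column must be written exactly as "ar_id" */
  select * from connection to TMP_DATA
  ( select "ar_id", ar_id2 from CASE_DEMO );

  /* this would fail: unquoted ar_id folds to AR_ID, which does not exist */
  /* select ar_id from CASE_DEMO                                          */
quit ;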
Thanks
Re:
"If you made the other table normally without adding quotes around the variable names then they would be in UPPERCASE. So if you tried
on a."ar_id" = b."ar_id"
Then it will find the variable SAS created but not the other one."
This was not what happened, and our DBA explained why. I used
on a."ar_id" = ar_id
and it worked fine. I asked the DBA why, and they explained that Snowflake uppercases all column names in the query. We don't see this; it's all behind the scenes.
So:
a.ar_id = b.ar_id failed, because it was converted to UPPERCASE and the SAS-created column a.AR_ID did not exist (the Snowflake column b.AR_ID does).
a."ar_id" = b.ar_id worked, because everything was converted to UPPERCASE except the quoted name from the SAS-created table.
With OPTIONS VALIDVARNAME=UPCASE everything works, and that is my go-forward approach here.
What I can't get out of the DBA, and they may not know, is whether any options are set that make it behave this way, or whether this is just how Snowflake works.
My concern now is that someone creates a permanent table (that's outside our area) and manages to set a column name in lowercase. We know SAS can do it, so I can't think of a reason something else couldn't do the same on a permanent table. The DBA assures me they can't, but agrees that from SAS it's possible. Fortunately, as far as I can tell, updating the actual database objects is restricted and evidently controlled in Snowflake to ensure columns are uppercase.
So your DBA agreed with what I said.
I seriously doubt there is a way to change it. Postgres handles unquoted names in a similar way (although it folds them to lowercase rather than uppercase), so you are likely stuck with it.