Is your session encoding UTF-8?
proc options option = encoding;
run;
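If you only want the value in the log, a shorter check is sketched below. It assumes nothing beyond the automatic macro variable SYSENCODING and the GETOPTION function, both of which report the session encoding.

```sas
/* Write the session encoding to the log in two equivalent ways */
%put NOTE: SYSENCODING is &sysencoding;
%put NOTE: ENCODING option is %sysfunc(getoption(encoding));
```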
Quentin, I did extensive testing on this issue in SAS and other tools. In SAS, I used SAS Unicode (utf-8) and SAS English (wlatin1).
My workaround in SAS Unicode is to run PROC DATASETS like below every time I pull in data from Snowflake, but it only gives me iso-8859-1, which seems to be a limitation of the Snowflake ODBC driver.
proc datasets library=&lib noprint;
modify &ds / correctencoding='iso-8859-1';
quit;
I am not sure what your second paragraph means.
Did you look at the hexcodes in the dataset? Were they the valid UTF-8 bytes you expected?
That PROC DATASETS code will just change the metadata attribute that indicates the encoding used to create the file. Changing the metadata about the encoding of the text in the dataset will not change what is in the dataset. It just tells future users of the data what to expect to find when they look at the data.
Did you look at the hexcodes in the dataset? Were they the valid UTF-8 bytes you expected?
You mean use a hex editor on the .sas7bdat file? No. Based on my other tests (like one in the next paragraph), Snowflake was not sending UTF-8.
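To see the bytes that are actually stored, rather than what the metadata claims, you do not need a hex editor. A minimal sketch using the $HEX format follows. It reuses the &lib and &ds macro variables from the PROC DATASETS step above, and textvar is a hypothetical name standing in for whichever character variable holds the non-ASCII text.

```sas
/* Print the first few values both as text and as hex bytes.        */
/* textvar is a placeholder: substitute a real character variable.  */
data _null_;
  set &lib..&ds (obs=10);
  put textvar= / textvar= $hex40.;  /* $hex40. shows the first 20 bytes */
run;
```

As a reference point, the letter é is stored as the two bytes C3A9 in UTF-8 and as the single byte E9 in iso-8859-1 and wlatin1, so the hex output shows which encoding the data really arrived in.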
Have you tried this LIBNAME option?
DBCLIENT_MAX_BYTES= LIBNAME Statement Option
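A rough sketch of where the option goes on the LIBNAME statement is below. It assumes the SAS/ACCESS Interface to Snowflake engine (snow). Every connection value is a made-up placeholder, and the value 4 is only an assumption that reflects the maximum number of bytes per character in UTF-8. If you connect through SAS/ACCESS to ODBC instead, the engine name and connection options will differ.

```sas
/* Hypothetical connection values: replace all of them with your own */
libname sf snow
  server="myaccount.snowflakecomputing.com"
  database=mydb
  schema=public
  warehouse=mywh
  user=myuser
  password="XXXXXXXX"
  dbclient_max_bytes=4;  /* assumed value: up to 4 bytes per character */
```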
One thing to keep in mind: variable lengths in Snowflake are based on character count, while in SAS they are based on byte count!
Therefore a value that fits in a Snowflake char/varchar(1) column may need a SAS variable of length 2 or more to display correctly.
Hope this helps
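The character-versus-byte difference can be seen with a small test like the one below. This is only a sketch: it builds the value from a hex literal so that the stored bytes do not depend on the encoding of the program file, and it compares LENGTH (bytes) with KLENGTH (characters in the session encoding).

```sas
data _null_;
  length s $ 8;
  s = 'C3A9'x;          /* the two bytes that encode é in UTF-8        */
  bytes = length(s);    /* LENGTH counts bytes                         */
  chars = klength(s);   /* KLENGTH counts characters (session-aware)   */
  put bytes= chars= s= $hex8.;  /* $hex8. shows the first four bytes   */
run;
```

In a UTF-8 session you would expect bytes=2 and chars=1, which is why a one-character Snowflake column can need a SAS length of 2 or more.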