Hi all,
We've tried many of the solutions posted here but are still running into a problem. I am opening an ODBC connection to a Postgres DB like this:
libname sql_read odbc dsn=OC64 schema=my_schema;
It works, but the database is encoded in UTF-8, and some columns containing Unicode characters come over to SAS as non-Unicode characters. For example,
≥
gets turned into "=".
We've tested the exact same driver and ODBC data source in Python, and the problem doesn't occur there, i.e. the data is not mistranslated. That suggests the problem is not with the driver or the ODBC data source.
Some other things we've tried are:
1. Ensuring the SAS session encoding is UTF-8. (Verified by reading UTF-8 characters with DATALINES, which come through correctly.)
2. Ensuring the column read into SAS has enough characters (it is character type and length 1024)
3. Trying to use the "correctencoding" argument, e.g.
proc datasets library=perm nodetails nolist;
modify my_table/ correctencoding='utf8';
quit;
4. Trying to use the "outencoding" argument on the library, e.g.
libname perm "my_path" outencoding='UTF-8';
5. Trying to use a PROC SQL odbc passthrough instead of the libname statement (same result).
Thank you for any help.
Hmm, usually encoding problems are solved by aligning encoding settings between applications. It looks like there are some character set differences between Postgres's UTF-8 and SAS's. I'd suggest raising a Tech Support track about this as they are in a better position to help.
What is your SAS session encoding?
proc options option = encoding ;
run;
It's UTF-8 according to that proc. I am opening SAS by pointing it to the UTF8 config file:
"C:\Program Files\SASHome\SASFoundation\9.4\sas.exe" -CONFIG "C:\Program Files\SASHome\SASFoundation\9.4\nls\u8\sasv9.cfg"
Thanks.
For anyone looking for solutions to this...
I just had the same problem: UTF-8 SAS session, UTF-8 Postgres DB, but still getting ANSI characters.
I solved it by adding the following to the connect settings of the ODBC data source:
SET CLIENT_ENCODING TO 'UTF8'
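If editing the ODBC data source isn't an option, the same command can probably be sent from SAS itself. The following is only a sketch and has not been tested against this setup. It assumes the DSN (OC64) and schema from the original post, uses "pg" as an arbitrary connection alias, and assumes the ODBC driver passes the SET and SHOW commands through unchanged.

```sas
/* Sketch only: OC64 and my_schema come from the original post; */
/* "pg" is just an alias chosen for this example.               */
proc sql;
  connect to odbc as pg (dsn=OC64);
  /* Ask Postgres to send text to this client as UTF-8 */
  execute (set client_encoding to 'UTF8') by pg;
  /* Check what the server now reports for this connection */
  select * from connection to pg (show client_encoding);
  disconnect from pg;
quit;

/* LIBNAME variant: DBCONINIT= runs a command right after connecting */
libname sql_read odbc dsn=OC64 schema=my_schema
        dbconinit="set client_encoding to 'UTF8'";
```

If SHOW CLIENT_ENCODING already reports UTF8 before the SET is run, the conversion is likely happening somewhere else (for example in an ANSI rather than Unicode build of the driver), and this setting alone may not help.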