Dear fellow users, I'm trying to load strings that contain emojis via DI Studio (SAS 9.4) into a MySQL database. I understand that MySQL is a bit tricky when it comes to character sets, but I changed the target table to utf8mb4, which definitely stores emojis. When I use the table loader transformation to append my datasets to the table in the database, it returns an "invalid utf8 character string" error.
Here's what I have tried:
1. check the character sets for all parts of the procedure: my data come from JSON files and are loaded via the JSON libname engine. They are correctly stored in a SAS work table. The MySQL library is set to utf8mb4 and so is the MySQL table.
2. test the database: when inserted through mysql workbench the emojis are stored just fine. The error only occurs if I use the SAS table loader.
3. check different modes of loading the data:
a) If I delete the MySQL table, then recreate it and manually set it to utf8mb4 prior to running the load job, the character set is reset to utf-8, thus causing the error.
b) If I set the table to utf8mb4, then enter a record manually and run the load job afterwards, the table character set isn't reset and the original record is unaffected. The table loader still returns the same error, though.
c) If I delete the MySQL table and then let SAS create a new one by running the table loader, the new table is set to utf-8 and again the same error shows.
So this seems to be a problem with how the table loader works. Does anybody know a solution to this?
Hi @Jazzman
I think SAS is complaining about the Character Set because it doesn't match the database. Character Set issues tend to be complicated; it is probably a good idea to contact SAS Tech Support. They may have helped others with this type of thing.
Best wishes,
Jeff
Thanks for your reply! However, this is not a SAS error but a database error in my opinion. The error doesn't originate in the attempt to load data with the table loader. It only appears when the data loader encounters a character that's not utf-8 compatible.
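A note on why only some rows would trigger the error: MySQL's `utf8` character set is an alias for `utf8mb3`, which stores at most 3 bytes per character, while emojis lie outside the Basic Multilingual Plane and need 4 bytes in UTF-8. The following is a minimal Python sketch (not SAS code, and not what the table loader does) for screening strings before a load. The function names `four_byte_chars` and `needs_utf8mb4` are hypothetical, chosen only for this illustration.

```python
from typing import List


def four_byte_chars(text: str) -> List[str]:
    """Return the characters in `text` that need 4 bytes in UTF-8.

    Code points above U+FFFF are encoded as 4 bytes in UTF-8, which
    MySQL's 3-byte `utf8` (utf8mb3) character set cannot store.
    """
    return [ch for ch in text if ord(ch) > 0xFFFF]


def needs_utf8mb4(text: str) -> bool:
    """True if `text` contains at least one 4-byte UTF-8 character."""
    return bool(four_byte_chars(text))


if __name__ == "__main__":
    # Hypothetical sample values standing in for rows of the work table.
    samples = ["plain ascii", "caf\u00e9 \u20ac", "thumbs up \U0001F44D"]
    for s in samples:
        print(repr(s), needs_utf8mb4(s))
```

If only the rows flagged by such a check fail to load, that would suggest the connection or the table is effectively using 3-byte `utf8` at load time, even though the table was set to utf8mb4 beforehand.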
I don't have a solution, just checking if your SAS session and data set use encoding=UTF-8.
Also, do you use ODBC? Could the ODBC driver be causing the issue?
Hello, my SAS session is in UTF-8 and so are all the work tables.
I use the SAS MySQL interface, not ODBC.
You might have to contact SAS Tech Support sadly.
Is there anything in the MySQL logs?
Long shot: Does this also happen when you don't use DI, but load directly in SAS?
I don't have access to the MySQL logs, unfortunately. I haven't tried recreating the problem without DI Studio. However, I don't expect any difference, since I would just recreate the code the DI table loader uses.
> However I don't expect any difference
Yes, long shot, there should be no differences. But there are unexpected things happening at times, and I find it best to leave no stones unturned.
The next steps are the MySQL logs and SAS Tech Support imho, unless you can find something interesting about emojis+MySQL+UTF-8 on the web, like this page. It seems utf8mb4 should be the default, and anything else is now considered a misconfiguration.
Another idea is to use the sastrace option and capture the exact communication taking place between SAS and MySQL. Something is wrong if the encoding changes after the table was created.
Good luck!