Hi All
I have searched and searched the web for answers, but am just not winning. I am reading a JSON file with Unicode characters that cause the error "Some code points did not transcode.". Some of the specific values are:
- \"
- \udigits
I have tried this:
filename test '/filepath/filename.json' encoding='UTF-8';
libname data JSON fileref=test ordinalcount=ALL;
but I still get the same error. Any ideas please? Unfortunately I cannot attach the file as it contains sensitive data.
Is your SAS session running with Unicode support? Check the value of the system option ENCODING.
%put %sysfunc(getoption(encoding));
If you are using a single-byte encoding like WLATIN1, there are MANY Unicode values that cannot be transcoded into one of the 256 possible single-byte characters.
Try reading the file in a SAS session that was started with ENCODING='UTF-8'.
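To make the single-byte limit concrete, here is a small sketch in Python rather than SAS (Python is my choice for illustration and is not used anywhere in this thread). The sample values are made up, and `fits_latin1` is a hypothetical helper. It shows that a `\uXXXX` escape is plain ASCII inside the file and only becomes a real character once the JSON is parsed, at which point it either has a Latin-1 code or it does not.

```python
import json

# A JSON document whose bytes are pure ASCII: the non-ASCII characters
# are written as \uXXXX escapes (hypothetical sample values).
raw = b'{"name": "Caf\\u00e9", "price": "\\u20ac5", "quote": "say \\"hi\\""}'

# Decoding the bytes as ASCII succeeds; the escapes are expanded by the parser.
record = json.loads(raw.decode("ascii"))


def fits_latin1(text):
    """Return True if every character in text has a Latin-1 (ISO-8859-1) code."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


for key, value in record.items():
    print(key, repr(value), "fits Latin-1:", fits_latin1(value))
```

In this sample the escaped quote and the é are representable in Latin-1, while the euro sign (U+20AC) is not. A value like that last one is the kind that cannot be stored in a LATIN1 session, whatever encoding the file itself is read with.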
Thanks!
Encoding = LATIN1. We are running SAS on servers, so I don't have the ability to change that. I can of course ask our admin guys to do it, but I will have to prove that there will be no impact on our existing processes, which is a major undertaking. Is that the only way to do this? So specifying encoding='UTF-8' in the filename statement is not sufficient?
There is a separate place in hell for admins who do not set their SAS server to UTF-8 in the 21st century. ;-D But jokes aside.
Do you have command line access to the server, via PuTTY, MobaXterm or similar software?
If you do, you can run a SAS session from the command line with UTF-8 encoding and then execute your code.
Let us know if you can, and if you need any support. The process is pretty easy, but if you have never done it, some support may be needed.
Bart
Hi Bart
Thanks for this! I do have access to the server via PuTTY, but will need help in order to run a SAS session via the command line please - I have never done that.
OK, so the first step is to locate your SAS home directory.
Run the following in your SAS Session to see the directory location:
filename f "!SASHOME";
filename f list;
you should see something like:
/path/to/sashome
in your log.
Next step:
Save the code you want to run in a file in your home directory; let's say the file will be "mycode.sas".
You can do it from your sas session by running something like:
data _null_;
file '~/mycode.sas';
infile CARDS4;
input;
put _infile_;
CARDS4;
/* your code starts here */
libname x "/some/path/to/some/data";
data mydata;
set x.someDataSet;
run;
/* your code ends here */
;;;;
run;
When you are ready, open PuTTY.
Navigate to your home directory by typing:
cd ~/
and hit Enter.
Type:
ls -l
to check that "mycode.sas" is there.
Then run the following line from the command line:
/path/to/sashome/SASFoundation/9.4/sas -encoding utf-8 ./mycode.sas
The log from the execution will be created automatically in your home directory as "mycode.log".
You can add:
%put &=sysencoding.;
at the beginning of your code to check whether the command line session started in UTF-8.
Bart
Thank you - this worked perfectly!! Now I will discuss changing the encoding permanently with our SAS Admins.
@HeidiDT wrote:
Thank you - this worked perfectly!! Now I will discuss changing the encoding permanently with our SAS Admins.
No. Don't change it. Instead make both available. So the USER can decide which encoding they want to use for this particular SAS session.
You don't have to change your existing processes. Just run this job that needs it with Unicode support.
Your SAS admins should be providing you with ways to run SAS with Unicode support. If you are running SAS on your PC it should be part of the normal install: you should see SAS 9.4 (English) and SAS 9.4 (Unicode Support) as available commands/apps to run in Windows. If you are running from the command line you would normally have either two different commands (sas94 and sas94_unicode, for example) or two different SAS configuration files you could point at.
If you are using SAS Studio or Enterprise Guide you should have a choice of an App server that supports Unicode.
Trying to read Unicode characters when running a single-byte encoding is like trying to put 10 pounds of potatoes into a five-pound bag. There is no place to put the extra characters.
You can read the JSON file as just a text file. You could then parse it yourself, or translate the Unicode yourself into some sequence of ANSI characters instead. But that might take more effort than having your SAS admins just do their jobs.
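If you do go the "translate it yourself" route, it could look roughly like the sketch below. It is Python rather than SAS and assumes Python 3 is available wherever the file lives. The function names and the use of "?" as the stand-in character are my own choices for illustration, not part of any SAS tool. The idea is to parse the JSON, replace every character that has no Latin-1 code, and write the result back out as plain ASCII JSON.

```python
import json


def to_latin1_safe(value):
    """Recursively replace characters that have no Latin-1 code with '?'."""
    if isinstance(value, str):
        # errors="replace" substitutes '?' for anything outside Latin-1.
        return value.encode("latin-1", errors="replace").decode("latin-1")
    if isinstance(value, list):
        return [to_latin1_safe(item) for item in value]
    if isinstance(value, dict):
        return {to_latin1_safe(k): to_latin1_safe(v) for k, v in value.items()}
    return value  # numbers, booleans and null pass through unchanged


def sanitize_json_text(text):
    """Parse JSON text and re-serialize it as ASCII with only Latin-1 escapes."""
    return json.dumps(to_latin1_safe(json.loads(text)), ensure_ascii=True)


def sanitize_json_file(src_path, dst_path):
    """Write a cleaned copy of src_path to dst_path (both paths are yours to pick)."""
    with open(src_path, encoding="utf-8-sig") as src:
        cleaned = sanitize_json_text(src.read())
    with open(dst_path, "w", encoding="ascii") as dst:
        dst.write(cleaned)


# Hypothetical sample: the e-acute survives, the euro sign does not.
sample = '{"name": "Caf\\u00e9", "price": "\\u20ac5"}'
print(sanitize_json_text(sample))
```

You would then point your FILENAME statement at the cleaned copy instead of the original. Keep in mind that this throws information away: every replaced character is gone for good, so it is only reasonable when you do not need those characters downstream.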