Hi everyone,
I am getting an error message when trying to run the code below with ENCODING=WLATIN1 in SAS Studio. Can you please suggest what might be wrong?
Note: the same code runs fine with ENCODING=UTF-8.
DATA T1;
var1="aábcčdď";
run;
Any suggestions would be a great help.
If you can get the Czech characters to work OK using UTF-8 encoding, then what is your question? From my understanding, Latin-1 does not include the Czech characters, but Latin-2 does. See Wikipedia: https://en.m.wikipedia.org/wiki/ISO/IEC_8859-2
I suggest you open a track with SAS Technical Support to confirm which SAS encodings support Czech characters.
Use a UTF-8 session to see what bytes SAS is actually storing for that string:
73   data T1;
74     var1='aábcčdď';
75     put var1= / var1 $hex.;
76   run;

var1=aábcčdď
61C3A16263C48D64C48F
Two of those characters do not exist in the LATIN1 encoding.
U+010D | č | c4 8d | LATIN SMALL LETTER C WITH CARON |
U+010F | ď | c4 8f | LATIN SMALL LETTER D WITH CARON |
One of them does exist in LATIN1, but would use only one byte there instead of two.
U+00E1 | á | c3 a1 | LATIN SMALL LETTER A WITH ACUTE |
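Outside SAS, the same byte-level facts can be checked in any language. A minimal sketch in Python (used here only as a neutral way to inspect Unicode encodings; it is not part of the SAS workflow):

```python
# Inspect how each character of the test string encodes in UTF-8 vs Latin-1.
s = "aábcčdď"

for ch in s:
    utf8 = ch.encode("utf-8").hex().upper()
    try:
        latin1 = ch.encode("latin-1").hex().upper()
    except UnicodeEncodeError:
        latin1 = "(not representable)"
    print(f"U+{ord(ch):04X} {ch}  utf-8={utf8}  latin-1={latin1}")

# The whole string as UTF-8 bytes matches the SAS $hex. output above:
print(s.encode("utf-8").hex().upper())  # 61C3A16263C48D64C48F
```

The loop shows that č (U+010D) and ď (U+010F) raise an encoding error for Latin-1, while á (U+00E1) encodes to the single byte E1.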
How do you propose to enter those characters into your program when using an encoding that does not support them? How do you propose to print them?
In theory, you could write the hex codes for those characters instead.
73   data T1;
74     var1='aábcčdď';
75     var2='a'||'c3a1'x||'bc'||'c48d'x||'d'||'c48f'x;
76     put var1= / var2= / var1=$hex. / var2=$hex.;
77   run;

var1=aábcčdď
var2=aábcčdď
var1=61C3A16263C48D64C48F
var2=61C3A16263C48D64C48F
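For comparison, the same hex-literal technique can be sketched in Python: the SAS 'c3a1'x literal corresponds to the raw UTF-8 byte pair C3 A1, so concatenating raw bytes and decoding reproduces the string:

```python
# Build the string from raw UTF-8 bytes, mirroring the SAS ||-with-'..'x approach.
raw = b"a" + b"\xc3\xa1" + b"bc" + b"\xc4\x8d" + b"d" + b"\xc4\x8f"
var2 = raw.decode("utf-8")

print(var2)               # aábcčdď
print(raw.hex().upper())  # 61C3A16263C48D64C48F
```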
Thanks, everyone, for the replies.
Actually, this is not the original test case; let me explain a bit.
We have integrated SAS Studio with SharePoint, so that we can import/export data between SAS Studio and SharePoint.
As you know, this connection is based on Microsoft's APIs; from SAS we get the response and push the file to SharePoint via PROC HTTP.
When we try to get the JSON response from SharePoint in SAS Studio (with PROC HTTP), the JSON is not read correctly in a WLATIN1 session (UTF-8 is all OK) when the SharePoint data contains Czech characters (some names have special characters).
For example, if a SharePoint column value contains Czech characters, the JSON shows the display name as "displayName":"Veronika Va\u0161t\u00edkov\u00e1" because the first and last name contain Czech characters (Veronika Vaštíková). Because of this, we are getting the error message below.
MPRINT(IMPORTBATCH.REFRESH): ;
MPRINT(IMPORTBATCH): ;
MPRINT(IMPORTBATCH): filename resp temp;
MPRINT(IMPORTBATCH): proc http url="https://graph.microsoft.com/v1.0/sites/abcdgroup.sharepoint.com:/sites/12abc:/drive"
oauth_bearer="." out = resp;
MPRINT(IMPORTBATCH): run;
NOTE: 401 Unauthorized
NOTE: PROCEDURE HTTP used (Total process time):
real time 0.18 seconds
cpu time 0.01 seconds
So why not just use UTF-8 SAS sessions?
Why do you need (or want) WLATIN1 sessions if you have to deal with UTF-8 characters?
Well, they (the end users) are getting data from a mainframe, which is WLATIN1 by default.
They have a UAC scheduler tool, where you can issue a command to get the data from the mainframe with an encoding=en/u8 parameter.
So they have written one SAS program with this logic:
1) Connect to the mainframe from the SAS Studio server.
2) Get the data as WLATIN1 from the mainframe to the SAS Studio server.
3) Push the same data/file to SharePoint.
When they tried it in one go through the UAC scheduler with encoding U8, it ran fine, but there are issues with EN (WLATIN1), as the Czech characters in the JSON cannot be converted.
Reading data FROM a source that is in WLATIN1 into a SAS session that is using UTF-8 should not be a problem. What is the issue they are having with this?
Reading a dataset written with encoding=WLATIN1 into a dataset using encoding=UTF-8 should also not be a problem.
The only thing you need to guard against: if your WLATIN1 source data has non-7-bit-ASCII characters (characters with accents, en-dashes, or "smart" quotes, for example), then you might need to make the character variables in the target SAS dataset longer, since some UTF-8 characters take more than one byte of storage.
SAS has a tool (the CVP libname engine) to automatically adjust the lengths of character variables. Basically, you give a multiplication factor and it expands every character variable by that factor.
Or you could use the ENCODING=ANY dataset option on the input dataset and write your own logic using KCVT() and other functions to calculate the length you need for each variable based on the data that is actually in the dataset.
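The length issue can be illustrated outside SAS as well. WLATIN1 corresponds closely to Windows code page 1252 (one byte per character), so a sketch in Python shows why a variable sized for WLATIN1 can be too short after transcoding to UTF-8:

```python
# WLATIN1 is essentially Windows code page 1252: one byte per character,
# including accents, en-dashes, and "smart" quotes.
s = "déjà vu – “quoted”"

wlatin1_bytes = s.encode("cp1252")
utf8_bytes = s.encode("utf-8")

print(len(wlatin1_bytes))  # one byte per character
print(len(utf8_bytes))     # more: accents take 2 bytes, en-dash/quotes take 3

# A rough per-string expansion factor, analogous to the multiplier
# you would give the CVP engine:
print(round(len(utf8_bytes) / len(wlatin1_bytes), 2))
```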
You have lost me.
I thought you said it worked when using a SAS session with UTF-8 encoding?
Is the UTF-8 SAS session the one that is writing the JSON file with the non-ASCII characters? Or is it the WLATIN1 SAS session that is doing that? If the latter, why do you continue to use the WLATIN1 sessions? Is there some other process that breaks when using SAS UTF-8 sessions?