kumarsandip975
Quartz | Level 8

Hi Everyone,

 

I get the error message below when I run the following code from SAS Studio in a session with ENCODING=WLATIN1. Can you please suggest a fix?

Note: The same code runs fine with ENCODING=UTF-8.

 

DATA T1;
var1="aábcčdď";
run;

 

[Attached screenshot of the error message: kumarsandip975_0-1718054975484.png]

 

12 REPLIES
kumarsandip975
Quartz | Level 8
Additionally, I tried the following to handle Czech characters, but got the same error message.

https://support.sas.com/documentation/cdl/en/nlsref/61893/HTML/default/viewer.htm#a002613623.htm

data t1 encoding=wlatin2 locale=cs_CZ DFLANG=Czech DATESTYLE=DMY PAPERSIZE=A4;
var1="aábcčdď";
run;
kumarsandip975
Quartz | Level 8

If anyone can suggest something here, it would be a great help.

SASKiwi
PROC Star

If you can get Czech characters to work OK using UTF-8 encoding, then what is your question? From my understanding, Latin1 does not include all Czech characters but Latin2 does. See Wikipedia: https://en.m.wikipedia.org/wiki/ISO/IEC_8859-2

 

I suggest you open a track with Tech Support to confirm which encodings in SAS support Czech characters. 

Tom
Super User

Use a UTF-8 session to see the actual bytes SAS stores for that string:

 

data T1;
  var1='aábcčdď';
  put var1= / var1 $hex.;
run;

var1=aábcčdď
61C3A16263C48D64C48F

 

 

Two of those characters do not exist in the LATIN1 encoding.

U+010D č c4 8d LATIN SMALL LETTER C WITH CARON

 

U+010F ď c4 8f LATIN SMALL LETTER D WITH CARON

 

One of them does, but would use only one byte and not two in LATIN1.

U+00E1 á c3 a1 LATIN SMALL LETTER A WITH ACUTE

 

How do you propose to enter them into your program when using an encoding that does not support those characters?  How do you propose to print them?
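The byte-level point above can be checked outside SAS. The following is a minimal Python sketch, not SAS code. It assumes that Python's cp1252 and cp1250 codecs correspond to the SAS encodings WLATIN1 and WLATIN2, and the helper name `unencodable` is made up for this illustration.

```python
# Illustration only: Python codecs stand in for SAS session encodings.
# Assumption: cp1252 ~ WLATIN1 and cp1250 ~ WLATIN2.
TEXT = "aábcčdď"

def unencodable(text, codec):
    """Return the characters of text that the codec cannot represent."""
    missing = []
    for ch in text:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            missing.append(ch)
    return missing

utf8_hex = TEXT.encode("utf-8").hex().upper()
missing_wlatin1 = unencodable(TEXT, "cp1252")
missing_wlatin2 = unencodable(TEXT, "cp1250")
wlatin2_hex = TEXT.encode("cp1250").hex().upper()

print(utf8_hex)         # UTF-8 bytes, comparable to the $hex. output
print(missing_wlatin1)  # characters with no code point in cp1252
print(missing_wlatin2)  # characters with no code point in cp1250
print(wlatin2_hex)      # one byte per character in cp1250
```

Under these assumptions, only the two caron letters fail in the Latin1-style code page, while the Latin2-style code page stores every character in a single byte.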

 

In theory you could write the HEX code for those characters instead.

data T1;
  var1='aábcčdď';
  var2='a'||'c3a1'x||'bc'||'c48d'x||'d'||'c48f'x;
  put var1= / var2= / var1=$hex. / var2=$hex.;
run;

var1=aábcčdď
var2=aábcčdď
var1=61C3A16263C48D64C48F
var2=61C3A16263C48D64C48F
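
One caveat with the hex approach is that these literals are the UTF-8 bytes of the characters, so they only read correctly where the bytes are interpreted as UTF-8. A minimal Python sketch of that point (not SAS; cp1252 is assumed to stand in for WLATIN1):

```python
# Illustration only: rebuild the string from the same hex byte literals.
raw = (b"a" + bytes.fromhex("c3a1") + b"bc"
       + bytes.fromhex("c48d") + b"d" + bytes.fromhex("c48f"))

# Interpreted as UTF-8, the bytes give back the original text.
as_utf8 = raw.decode("utf-8")

# Interpreted as cp1252, each byte becomes its own character.
# Bytes 0x8D and 0x8F are unassigned in Python's cp1252 codec, so they
# are replaced with U+FFFD here instead of raising an error.
as_cp1252 = raw.decode("cp1252", errors="replace")

print(raw.hex().upper())
print(as_utf8)
print(as_cp1252)
```

The same ten bytes therefore show the intended text in a UTF-8 context and different characters in a single-byte context.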

 

kumarsandip975
Quartz | Level 8

Thanks everyone for the reply.

Actually, that test case is not the original problem. Let me try to explain a bit.

 

We have integrated SAS Studio with SharePoint sites, so that ideally we can import and export data between SAS Studio and SharePoint.

 

As you know, this connection is based on APIs from Microsoft; from SAS we can get a response and push files to SharePoint via PROC HTTP.

 

When we request a JSON response from SharePoint in SAS Studio (with PROC HTTP), the JSON does not come through correctly under WLATIN encoding (under U8 everything is OK) when the details read from SharePoint contain Czech characters (for example, names with special characters).
For example, if a name in SharePoint contains Czech characters, the JSON shows the display name as "displayName":"Veronika Va\u0161t\u00edkov\u00e1"}}. It comes through like this because the first name plus last name contains Czech characters (Veronika Vaštíková). Because of this, we get the error message below.

MPRINT(IMPORTBATCH.REFRESH): ;
MPRINT(IMPORTBATCH): ;
MPRINT(IMPORTBATCH): filename resp temp;
MPRINT(IMPORTBATCH): proc http url="https://graph.microsoft.com/v1.0/sites/abcdgroup.sharepoint.com:/sites/12abc:/drive"
oauth_bearer="." out = resp;
MPRINT(IMPORTBATCH): run;

NOTE: 401 Unauthorized
NOTE: PROCEDURE HTTP used (Total process time):
real time 0.18 seconds
cpu time 0.01 seconds

Tom
Super User

So why not just only use the UTF-8 sessions of SAS?

Why do you need (or want) WLATIN1 sessions if you have to deal with UTF-8 characters?

kumarsandip975
Quartz | Level 8

Well, they (the end users) get data from a mainframe, which is WLATIN by default.
They have a scheduler tool, UAC, where a command can fetch the data from the mainframe with an encoding=en/u8 parameter.

 

So they have written one SAS program with the following logic:
1.) Connect to the mainframe from the SAS Studio server.

2.) Get the data as WLATIN from the mainframe to the SAS Studio server.

3.) Then push the same data/file to SharePoint.

When they ran it in one go through the UAC scheduler with encoding U8, it ran fine, but with EN (WLATIN) there were issues, as the JSON was not able to convert Czech characters.


kumarsandip975
Quartz | Level 8
We have suggested creating two UAC jobs: the first gets the data from the mainframe with WLATIN, and the second pushes the data to SharePoint with U8.

For them, monitoring that many jobs is a bit challenging, but that is the workaround I see.
Tom
Super User

Reading data FROM a source that is in WLATIN1 into a SAS session that is using UTF-8 should not be a problem.  What is the issue they are having with this?

 

Reading a dataset written with encoding=WLATIN1 into a dataset using encoding=UTF-8 should also not be a problem.

 

The only thing you need to guard against is that if your WLATIN1 source data has some non-7bit ASCII characters (those characters with accents or en-dash or stupid quotes for example) then you might need to make the target SAS dataset have longer character variables, since some UTF-8 characters take more than one byte of storage.

 

SAS has some tools to try and automatically adjust the length of character variables.  Basically you give a multiplication factor and it just expands every character variable by that factor.

 

Or you could use the ENCODING=ANY dataset option on the input dataset and write your own logic using KCVT() and other functions to calculate the length you need for each variable based on the data that is actually in the dataset.
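
As a rough illustration of why lengths can grow, here is a minimal Python sketch (not SAS) that compares the byte lengths of a few values in a single-byte encoding and in UTF-8. The sample values and the helper name `utf8_growth` are invented for the example, and cp1252 is assumed to stand in for WLATIN1.

```python
# Illustration only. The sample values are made up.
SAMPLES = ["plain ascii", "Vaštíková", "2019–2024"]

def utf8_growth(value, source_codec="cp1252"):
    """Return (bytes in the source codec, bytes in UTF-8) for one value."""
    return len(value.encode(source_codec)), len(value.encode("utf-8"))

lengths = [utf8_growth(v) for v in SAMPLES]

# Length a UTF-8 target variable would need for these particular values.
needed = max(utf8_len for _, utf8_len in lengths)

# Largest observed growth, comparable to a blanket multiplication factor.
factor = max(utf8_len / src_len for src_len, utf8_len in lengths)

for value, (src_len, utf8_len) in zip(SAMPLES, lengths):
    print(f"{value!r}: {src_len} -> {utf8_len} bytes")
print("longest UTF-8 value:", needed, "bytes")
print("largest growth factor:", round(factor, 2))
```

A factor measured on the data can come out smaller than a blanket worst-case multiplier, which is the trade-off between the two approaches described above.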

kumarsandip975
Quartz | Level 8
I think there is some misunderstanding. Let me clarify.

The issue is not with the data.

When you connect to SharePoint from SAS, it creates a JSON file that contains details of the SharePoint site, for example URLs, file details, and the name of whoever last modified a file. If that person's name contains special characters (here, a Czech name), the JSON file shows it as "displayName":"Veronika Va\u0161t\u00edkov\u00e1"}}, whereas normally it should appear as "displayName":"Veronika Vaštíková"}}.

Only once you get the JSON file correctly does the move to SharePoint actually deal with your source data (which they received from the mainframe).

Conclusion: the issue is with the JSON file, which holds the SharePoint site data as I explained, not with the source data.
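
For reference, sequences such as \u0161 are standard JSON string escapes, so the escaped and the literal form describe the same name once a JSON parser has decoded them. A minimal Python sketch (not SAS) using only the standard `json` module, with the sample text taken from the display name quoted above:

```python
import json

# The escaped text as it appears in the response
# (a raw string keeps the backslashes as they are).
escaped = r'{"displayName":"Veronika Va\u0161t\u00edkov\u00e1"}'

# Parsing decodes the \uXXXX escapes into the actual characters.
name = json.loads(escaped)["displayName"]

# Serializing again: ensure_ascii=True (the default) re-escapes
# non-ASCII characters, ensure_ascii=False writes them literally.
reescaped = json.dumps({"displayName": name})
literal = json.dumps({"displayName": name}, ensure_ascii=False)

print(name)
print(reescaped)
print(literal)
```

Whether the escapes appear in the file therefore depends on how the writer serialized it. A conforming JSON reader should produce the same string either way.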
Tom
Super User

You have lost me.

 

I thought you said it worked when using SAS session with UTF-8 encoding?

 

Is the UTF-8 SAS session the one that is writing the JSON file with the non-ANSI characters using the encoding?  Or is it the WLATIN1 SAS session that is doing that?  If the latter, why do you continue to use the WLATIN1 SAS sessions?  Is there some other process that breaks when using SAS UTF-8 sessions?

