Impact of changing default sas session encoding from wlatin1 to utf-8?

Occasional Contributor
Posts: 5

Hello SAS gurus, the code page of one of our data warehouse databases was recently changed to Unicode, and we would like to test with SAS EG (5.3) whether SAS can interpret Unicode (multibyte) characters properly.

A simple test for this is to create a sample file containing multibyte characters, read it from SAS EG, and print it as output. Please find attached a document with the input file, SAS code, output, and SAS log.

After further analysis I found SAS Note http://support.sas.com/kb/51586, which suggests switching to the UTF-8 SAS executable.

My concerns/questions:
- There are approx. 400 users and this would be a system-wide change, so I would like to know the impact.
- Does it impact the metadata server?
- Do any of the users need to change their programs, or are there any known issues with SAS functions that won't work as they do with the latin1 character set?
- Is any additional disk space required for the SAS work space?
- Will any additional memory be used at run time?
- Are there any performance changes w.r.t. the run time of SAS programs?
- If some issue happens, can I simply roll it back?
- Can this be achieved without a system-wide change?

 

Has anyone made this kind of change in the past? I would appreciate your input.

Trusted Advisor
Posts: 1,141

Re: Impact of changing default sas session encoding from wlatin1 to utf-8?

 

- There are approx. 400 users and this would be a system-wide change, so I would like to know the impact.

Will you need the Unicode server for all 400 users, or only for batch jobs? That is a good question to start with.


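- Does it impact the metadata server?

Yes, it does. The lengths of character fields in all registered metadata can grow, up to double as a rule of thumb. That comes from the change from single-byte characters to a variable-width encoding: in UTF-8, ASCII characters still take 1 byte, but other characters take 2 to 4 bytes.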

 

- Do any of the users need to change their programs, or are there any known issues with SAS functions that won't work as they do with the latin1 character set?

Some changes might be required, but this is only a maybe. As an example, the treatment of delimiters and characters is slightly different. You will also probably need to use INENCODING=/OUTENCODING= in your LIBNAME statements (and ENCODING= in your DATA steps):

 

INENCODING=ANY | ASCIIANY | EBCDICANY | encoding-value

overrides the encoding when you are reading (input processing) SAS data sets in the SAS library.

See INENCODING= and OUTENCODING= Options in SAS National Language Support (NLS): Reference Guide

OUTENCODING=

OUTENCODING=ANY | ASCIIANY | EBCDICANY | encoding-value

overrides the encoding when you are creating (output processing) SAS data sets in the SAS library.
See INENCODING= and OUTENCODING= Options in SAS National Language Support (NLS): Reference Guide
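
As a rough illustration of the two options quoted above, here is a minimal sketch. The paths, librefs, and table name are hypothetical, and it assumes the existing data sets really are stored as wlatin1; it is not a tested migration procedure.

```sas
/* Hypothetical paths and librefs: adjust to your site. */

/* Treat the existing data sets as wlatin1 when reading them. */
libname legacy '/data/legacy' inencoding='wlatin1';

/* Create new data sets in this library with UTF-8 encoding. */
libname unidata '/data/unicode' outencoding='utf-8';

/* Copy one (hypothetical) table. Character values are transcoded   */
/* on output; values can be truncated if a column is too short for  */
/* the UTF-8 form of its contents, so check the log for warnings.   */
data unidata.customers;
    set legacy.customers;
run;
```

This only changes how individual libraries are read and written; the session encoding itself stays whatever the SAS session was started with.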


- Is any additional disk space required for the SAS work space?

It depends on whether you use the COMPRESS= option and the like, but in theory yes: between 1.5 and 2 times the current size.
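
To get a feel for how much wider character columns might become, one option (not mentioned above, so treat it as an assumption about what fits your site) is the read-only CVP engine, which expands character variable lengths on the fly. The path and table name below are hypothetical, and the multiplier simply mirrors the lower end of the estimate above.

```sas
/* Hypothetical path. CVP is a read-only engine that pads character */
/* variable lengths so transcoded values are less likely to be      */
/* truncated.                                                       */
libname src cvp '/data/legacy' cvpmultiplier=1.5;

/* Inspect the variable lengths as seen through the CVP engine. */
proc contents data=src.customers;
run;
```

Comparing this PROC CONTENTS output with the same report from a plain LIBNAME on that path gives an indication of the extra width per row before compression.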


- Will any additional memory be used at run time?

Yes, but I doubt it will be really noticeable.


- Are there any performance changes w.r.t. the run time of SAS programs?

Yes, but I doubt it will be really noticeable.


- If some issue happens, can I simply roll it back?

With a good procedure, and knowing what you are doing, you can always roll back. Be careful with the batch processes.

 

- Can this be achieved without a system-wide change?

The best option is to create an additional, separate SASApp server configured as UTF-8, and let only the processes and users that need it use it. Or just use the LIBNAME statement options.

 

An additional remark:

I would start with a test on Base SAS only, with no metadata and no important data or processes involved.

You can test the INENCODING= and OUTENCODING= options on LIBNAME statements and in DATA steps, and see how they work with your data and reports.

Those options won't work on data from ODBC, since that is not supported.
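
A Base SAS test along these lines could look like the sketch below. The file path and data set name are hypothetical, and the input file is assumed to be saved as UTF-8.

```sas
/* Show the encoding of the current session. */
proc options option=encoding;
run;
%put NOTE: Session encoding is &sysencoding;

/* Hypothetical path; the file is assumed to be UTF-8 encoded. */
filename testin '/tmp/multibyte_sample.txt' encoding='utf-8';

/* Read each line as text. In a wlatin1 session, characters that   */
/* have no wlatin1 equivalent cannot be represented, which is the  */
/* situation a UTF-8 session avoids.                               */
data work.mb_test;
    infile testin truncover;
    input text $char200.;
run;

proc print data=work.mb_test;
run;
```

Running the same program once in the current wlatin1 session and once in a UTF-8 session shows the difference without touching any metadata.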

 

With this, you achieve two things:

1- You evaluate the impact of this change on your code and system while minimizing the risk.

2- You can make a well-informed decision on whether the change should happen only in the code, at the SASApp level, or as a full reconfiguration of your system (normally not required).

SAS Employee
Posts: 8

Re: Impact of changing default sas session encoding from wlatin1 to utf-8?

The white paper titled Multilingual Computing with SAS 9.4 has information that you may find helpful. It was written for 9.4, but several of the features documented there are available with earlier releases of SAS 9.

http://support.sas.com/resources/papers/Multilingual_Computing_with_SAS_94.pdf 

 

For example, the SAS string functions (e.g., INDEX and SUBSTR) assume that one byte equals one character. If all of the characters in your data are ASCII (those you see on a QWERTY keyboard), no changes are needed. However, if you have multibyte characters in your data, you need to use the K functions (for example, KINDEX and KSUBSTR). The National Language Support (NLS): Reference Guide has a table showing the compatibility level for each SAS string function. See the section "Internationalization Compatibility for SAS String Functions" in the Functions for NLS section of the document.
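
The byte-versus-character difference can be seen with a small DATA step. This is only a sketch: it assumes a UTF-8 session and a program file that is itself saved as UTF-8, and the variable names are made up for illustration.

```sas
data _null_;
    length word $ 20;
    word = 'café';   /* under the UTF-8 assumption, é takes 2 bytes */

    bytes = length(word);     /* LENGTH counts bytes        */
    chars = klength(word);    /* KLENGTH counts characters  */

    /* SUBSTR addresses bytes, so position 4 is only part of é. */
    part_of_char = substr(word, 4, 1);

    /* KSUBSTR addresses characters, so position 4 is the whole é. */
    whole_char = ksubstr(word, 4, 1);

    put bytes= chars= whole_char=;
run;
```

In a wlatin1 session both counts would be the same, which is why programs that only ever saw single-byte data can behave differently after the switch.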

 
