06-02-2014 02:43 AM
I have a question about encoding in SAS. I'm doing a project and I need to do it in Cyrillic.
I have not got the full data yet, but I was experimenting with the following code:
data x (encoding='wcyrillic');
length str $20;
str = 'уникод';
data x (encoding=wcyrillic);
length str $20;
str = 'уникод';
Both versions give an error.
Please help: how do I solve issues like this?
06-02-2014 03:41 AM
You must use a DBCS installation option of SAS that supports Unicode. Most installations are done with the single-byte approach, thinking in the world of latin1 (only English-like languages, with several code pages).
Enterprise Guide supports UTF-8 on the client side, so you can type the code as you have shown, but the SAS process itself could be limited to latin1. Analytics U supports UTF-8 in the SAS session.
Remember that UTF-8 is a variable-width encoding: a character can take more than one byte, which can be really challenging for anyone used to the old single-byte way of working. See: SAS(R) 9.4 National Language Support (NLS): Reference Guide, Second Edition
See the note about I18N level2 functions. The old single-byte thinking is problematic.
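To make the byte-versus-character point concrete, here is a minimal sketch. It assumes the session is already running with ENCODING=UTF-8; in a wlatin1 session the Cyrillic literal would not survive. LENGTH counts bytes, while KLENGTH is one of the K functions that count characters.

```sas
/* Sketch only: assumes a SAS session started with ENCODING=UTF-8. */
data _null_;
  length str $20;          /* $20 means 20 bytes, not 20 characters */
  str = 'уникод';
  bytes = length(str);     /* byte-based length */
  chars = klength(str);    /* character-based length */
  put bytes= chars=;
run;
```

Each Cyrillic letter takes two bytes in UTF-8, so a $20 variable holds at most ten of them. Size character variables with that in mind.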
You can store your SAS code on the OS as usual, but it will get a BOM (byte order mark) indicating the UTF-8 usage.
Not all other tools are able to work with that. Some/many SAS config files must be old-fashioned ASCII. A tool like Notepad++ can help you with files at that level.
06-02-2014 04:04 AM
Is it possible to make such changes for a single SAS session and return to the defaults after I quit SAS?
It is a small one-time project, and I guess I will not need Cyrillic in the near future.
Here is the output from
proc options group=languagecontrol; run;
DATESTYLE=MDY Identify sequence of month, day and year when ANYDATE informat data is
DFLANG=ENGLISH Language for EURDF date/time formats and informats
NOLOCALELANGCHG Do not change the language of SAS message text in ODS output when the
LOCALE option is specified
PAPERSIZE=LETTER Size of paper to print on
RSASIOTRANSERROR Display a transcoding error when illegal data values for a remote
Names of translate tables
Specifies URL percent encoding for the URLENCODE and URLDECODE functions
NODBCS Do not process double byte character sets
DBCSLANG=NONE Specifies the double-byte character set (DBCS) language to use
DBCSTYPE=NONE Specifies a double-byte character set (DBCS) encoding method
ENCODING=WLATIN1 Specifies default encoding for internal processing of data
LOCALE=EN_US Specifies the current locale for the SAS session
NONLSCOMPATMODE Uses the user specified encoding to process character data
06-02-2014 04:13 AM
I do not see your SAS version/release or whether you are on Windows or Unix. In both cases it is possible, provided the installation was done with all language support options.
Technically, you can switch between those different language versions.
It is not only Cyrillic but all languages. With the UTF-8 version of a SAS session you should be able to process them all.
Why go back to the Hollerith approach? Because of some limitations (for example, mainframes do not support UTF-8) you need both options.
06-02-2014 06:12 AM
On Windows you can check the folder: bit version of SAS). In this folder you should find the nls folders, such as "en" (English) and "u8".
If your installation is missing that "u8" folder, the person installing SAS did not select the UTF-8/DBCS option while installing SAS.
In the u8 folder you will find a sasv9.cfg file. This is the one that, when activated, will run SAS in "u8" mode. The default sasv9.cfg in the 9.3 folder is just a pointer to the "en" version.
Knowing this, it should be very easy to get sessions with those different encodings running.
It is not very complicated.
The real complication is understanding utf8 in the first place.
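Once a session has been started with the sasv9.cfg from the u8 folder (for example by pointing the -CONFIG startup option at it), a quick check along these lines should confirm it worked. This is a sketch under that assumption. The data step is the test from the original question with the ENCODING= data set option dropped, since the data set then simply inherits the session encoding.

```sas
/* Assumes SAS was started with the sasv9.cfg from the u8 folder. */

/* Report the encoding the current session is running in. */
%put Session encoding: &sysencoding;

/* The original test, relying on the session encoding. */
data x;
  length str $20;
  str = 'уникод';
run;

/* The Encoding attribute in the output shows how the data set was stored. */
proc contents data=x;
run;
```

Starting SAS again with the default configuration returns you to the "en" (wlatin1) setup, so nothing has to be undone afterwards.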