1. I have sorted out the encoding issue (with your help and that of SAS customer support). The problem was that my SAS session was invoked with the following Target:

"C:\Program Files\SASHome\SASFoundation\9.4\sas.exe" -CONFIG "C:\Program Files\SASHome\SASFoundation\9.4\nls\en\sasv9.cfg"

I changed the "en" to "u8", and I now have the needed "ENCODING=UTF-8".

2. As I indicated, there are now so many postings that I find them hard to sort out. So let's start with your suggestion, Eyal:

data _null_;
win1255name = "אבג דהו";
put win1255name $hex20.;
/* convert to Hebrew DOS */
pcoemname = kcvt(win1255name,"pcoem862");
put pcoemname $hex20.;
/* convert to UTF8 */
utfname = kcvt(win1255name,"utf8");
put utfname $hex20.;
/* convert to Unicode NCR */
utf8ncr = unicodec(win1255name,"NCR");
put utf8ncr ;
/* convert to Unicode ESC */
utf8esc = unicodec(win1255name,"ESC");
put utf8esc ;
/* convert back to Hebrew using unicode or kcvt */
win1255name2 = unicode(utfname,"utf8");
put win1255name2 $hex20.;
run;

Having run this code, I get the following log:

5
6 data _null_;
7 win1255name = "אבג דהו";
8 put win1255name $hex20.;
9 /* convert to Hebrew DOS */
10 pcoemname = kcvt(win1255name,"pcoem862");
11 put pcoemname $hex20.;
12 /* convert to UTF8 */
13 utfname = kcvt(win1255name,"utf8");
14 put utfname $hex20.;
15 /* convert to Unicode NCR */
16 utf8ncr = unicodec(win1255name,"NCR");
17 put utf8ncr ;
18 /* convert to Unicode ESC */
19 utf8esc = unicodec(win1255name,"ESC");
20 put utf8esc ;
21 /* convert back to Hebrew using unicode or kcvt */
22 win1255name2 = unicode(utfname,"utf8");
23 put win1255name2 $hex20.;
24 run;
D790D791D79220D793D7
80818220838485202020
D790D791D79220D793D7
אבג דהו
\u05D0\u05D1\u05D2 \u05D3\u05D4\u05D5
D790D791D79220D793D7

Of the 6 different options, the (almost) correct function is:

utf8esc = unicodec(win1255name,"ESC");

which produces:

\u05D0\u05D1\u05D2 \u05D3\u05D4\u05D5

However, this function (1) adds "\" and "u" in front of each Unicode code point, additions that my %label statement doesn't understand; and (2) leaves an actual space between the two Hebrew strings instead of the Unicode "0020" for a space.

3. Is there a way to "clean" this resulting string so that it contains only the necessary Unicode code points? I won't be surprised if Shmuel has already solved it in his posts, but I'd appreciate having a fresh summary instead of having to sort it out myself.

Thanks,
Jonathan
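
For reference, here is a minimal sketch of the kind of cleanup asked about in point 3. It is not a tested solution. It assumes that %label wants the bare four-digit hex codes run together, which is not confirmed anywhere in this thread. The variable name `clean` and the length of 200 are made up for illustration. It also only handles embedded blanks: other ASCII characters, which UNICODEC would presumably leave unescaped in "ESC" mode, are not converted.

```sas
data _null_;
  length utf8esc clean $200;  /* lengths are an assumption for this sketch */

  /* Sample value copied from the log above */
  utf8esc = '\u05D0\u05D1\u05D2 \u05D3\u05D4\u05D5';

  /* Step 1: drop trailing blanks, then turn each embedded blank into 0020 */
  clean = tranwrd(strip(utf8esc), ' ', '0020');

  /* Step 2: remove every "\u" prefix. TRANSTRN with TRIMN('') deletes the
     target outright, whereas TRANWRD would substitute a single blank. */
  clean = transtrn(clean, '\u', trimn(''));

  put clean=;
run;
```

If %label turns out to need a separator between the codes, the replacement in step 2 could be a delimiter instead of TRIMN('').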