<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic encoding for proc http to sas7bdat, without corrupting characters in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/626520#M184815</link>
    <description>encoding for proc http to sas7bdat, without corrupting characters</description>
    <pubDate>Sun, 23 Feb 2020 19:56:53 GMT</pubDate>
    <dc:creator>GGO</dc:creator>
    <dc:date>2020-02-23T19:56:53Z</dc:date>
    <item>
      <title>encoding for proc http to sas7bdat, without corrupting characters</title>
      <link>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/626520#M184815</link>
      <description>&lt;P&gt;I've searched the communities for solutions to this, or at least explanations, but found none.&lt;/P&gt;&lt;P&gt;Any insights would be helpful.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Here is example code, to illustrate the problems.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Objective: Read a UTF-8 encoded page; parse it into SAS variables; and store as sas7bdat ... &lt;STRONG&gt;without corrupting&lt;/STRONG&gt; &lt;STRONG&gt;UTF-8 chars&lt;/STRONG&gt;.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Example code.&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;%let w_encoding = UTF-8;
%let r_encoding = UTF-8;

filename WRITE './unitslab_conversions.txt' encoding="&amp;amp;W_ENCODING";
filename  READ './unitslab_conversions.txt' encoding="&amp;amp;R_ENCODING";

libname data './data' inencoding="&amp;amp;R_ENCODING" outencoding="&amp;amp;W_ENCODING";

proc http
  method="GET"
  url="https://unitslab.com/"
  out=WRITE ;
run;

data data.lbtests;
  infile READ length=len lrecl=32767;
  input line $varying32767. len;

  line = strip(line);

  if prxmatch('/^&amp;lt;li&amp;gt;.+\/node/i', line);
  if prxmatch('/(microglobulin|cancer|beta|kappa|mass|mullerian)/i', line);

  lbtest = strip(scan(line,5,'&amp;lt;&amp;gt;'));
  node   = strip(scan(line,2,'"'));

*--- Paste in strings from the UTF-8 page 
     NB - NONE OF THESE MATCH ;
  if    index(lbtest, 'anti-Mullerian')
     or index(lbtest, 'Beta 2-microglobulin (ß2-M)')
     or index(lbtest, 'Free ß-subunit of human chorionic gonadotropin (free ßhCG)')
     or index(lbtest, 'Kappa (κ)')
     then putlog 'INFO: FOUND EXPECTED string match: ' lbtest=;

*--- https://www.w3schools.com/charsets/ref_html_utf8.asp 
     NB - LIMITED CHAR RANGE ALSO MATCH TRANSCODE (CORRUPTED) chars above \x{2000} ;
  if prxmatch('/[\x{03b2}\x{052f}]/', line)
     then putlog 'INFO: FOUND EXPECTED HTML5 UTF-8 char: ' lbtest=;

  keep lbtest node;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Problems with results:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;1 - SAS log - Notice the corrupted UTF-8 chars - I've bolded some UTF-8 hyphens and greek chars:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;lbtest=alpha-1&lt;STRONG&gt;â€‘&lt;/STRONG&gt;microglobulin node=/node/89
lbtest=anti&lt;STRONG&gt;â€&amp;#144;&lt;/STRONG&gt;Mullerian hormone (AMH) node=/node/155
lbtest=beta - CrossLaps - Degradation products of type I collagen node=/node/164
lbtest=Beta 2&lt;STRONG&gt;â€&amp;#144;&lt;/STRONG&gt;microglobulin (&lt;STRONG&gt;Î²&lt;/STRONG&gt;2&lt;STRONG&gt;â€&amp;#144;&lt;/STRONG&gt;M) node=/node/145
lbtest=beta-Hydroxybutyric acid node=/node/225
lbtest=CA 125 (Cancer Antigen 125) node=/node/104
lbtest=CA 15-3 (Cancer Antigen 15-3) node=/node/105
lbtest=CA 72-4 (Antig&lt;STRONG&gt;Ã¨&lt;/STRONG&gt;ne de cancer 72-4) node=/node/107
lbtest=CK&lt;STRONG&gt;â€&amp;#144;&lt;/STRONG&gt;MB mass - the MB isoenzyme of creatine kinase (quantitative determination) node=/node/157
lbtest=Kappa (&lt;STRONG&gt;Îº&lt;/STRONG&gt;) light chain node=/node/150&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;2 - Similar in the resulting SAS7BDAT, despite encoding being &lt;STRONG&gt;UTF-8,&lt;/STRONG&gt; but different corruption, appearing to eat up more char bytes:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="sas7bdat.PNG" style="width: 548px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/36302i21DD6E3074A6706B/image-size/large?v=v2&amp;amp;px=999" role="button" title="sas7bdat.PNG" alt="sas7bdat.PNG" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;3 - Similar result viewing SAS7BDAT outside a SAS session - UNIVERSAL VIEWER, but still different corruption of UTF-8 chars:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="uv_sas7bdat.PNG" style="width: 488px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/36303i389F353F986031D2/image-size/large?v=v2&amp;amp;px=999" role="button" title="uv_sas7bdat.PNG" alt="uv_sas7bdat.PNG" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks for any insights, tips or fixes to this code that can avoid this UTF-8 character corruption.&lt;/P&gt;&lt;P&gt;GG&lt;/P&gt;</description>
      <pubDate>Sun, 23 Feb 2020 19:56:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/626520#M184815</guid>
      <dc:creator>GGO</dc:creator>
      <dc:date>2020-02-23T19:56:53Z</dc:date>
    </item>
    <item>
      <title>Re: encoding for proc http to sas7bdat, without corrupting characters</title>
      <link>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/626683#M184867</link>
      <description>&lt;P&gt;I might be wrong but I don't think what you are showing means there is corruption.&lt;/P&gt;
&lt;P&gt;It might be that the data is fine but the viewers don't support UTF-8.&lt;/P&gt;
&lt;P&gt;What's the encoding of your SAS environment?&lt;/P&gt;
&lt;P&gt;Do the txt files display correctly in a UTF-8-aware viewer such as Notepad++?&lt;/P&gt;
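&lt;P&gt;A viewer-independent way to check is to print the stored bytes in hex. The step below is only a minimal sketch: the variable name lbtest is borrowed from the original post, and the test string is built by hand from hex literals rather than read from the downloaded page. In UTF-8, U+2011 (non-breaking hyphen) is the three bytes E2 80 91, so finding those bytes in the data would suggest the text is intact and only the display is off.&lt;/P&gt;

```sas
data _null_;
  length lbtest $40;

  /* 'alpha-1' + U+2011 + 'microglobulin', with U+2011 written as its UTF-8 bytes */
  lbtest = 'alpha-1' || 'E28091'x || 'microglobulin';

  /* Dump the first 23 bytes as 46 hex digits, independent of any viewer or font */
  put lbtest= $hex46.;

  /* Byte-wise search for the UTF-8 byte sequence of U+2011 */
  pos = index(lbtest, 'E28091'x);
  put pos=;
run;
```

&lt;P&gt;INDEX works on bytes here, so pos should come back as 8 whatever the session encoding. A Windows-1252 viewer shows those same three bytes as three separate characters, which would account for the odd-looking log output.&lt;/P&gt;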
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 24 Feb 2020 00:57:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/626683#M184867</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-02-24T00:57:29Z</dc:date>
    </item>
    <item>
      <title>Re: encoding for proc http to sas7bdat, without corrupting characters</title>
      <link>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/626746#M184885</link>
      <description>&lt;P&gt;Thanks, Chris - Unfortunately, it is not only a display problem.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Bottom line: SAS corrupts characters in the HTML5 UTF-8 range above the Greek/Cyrillic chars.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Reference for char hex values: w3schools page &lt;A href="https://www.w3schools.com/charsets/ref_html_utf8.asp" target="_self"&gt;HTML UTF-8 encoding&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Notepad++ preserves but fails to display&lt;/STRONG&gt; HTML UTF-8 chars in the range &lt;FONT face="courier new,courier"&gt;&lt;STRONG&gt;\x{2000}-\x{27bf}&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;According to the W3 HTML UTF-8 page, above, these are &lt;STRONG&gt;all chars after "Cyrillic Supplement"&lt;/STRONG&gt;&lt;/LI&gt;&lt;LI&gt;But at least Notepad++ preserves the correct chars, despite display problems - regex &lt;FONT face="courier new,courier"&gt;[\x{2000}-\x{27bf}]&lt;/FONT&gt;&lt;/LI&gt;&lt;LI&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="npp-utf-8-display-BAD.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/36335iDB378CCB5D28BAB6/image-size/medium?v=v2&amp;amp;px=400" role="button" title="npp-utf-8-display-BAD.png" alt="npp-utf-8-display-BAD.png" /&gt;&lt;/span&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;By comparison, SAS 9.04 corrupts chars at least above \x{052F}.&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;I've updated the code above to demonstrate this.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The resulting log file snippet follows. 
SAS and Notepad++ have similar display problems:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;FONT face="courier new,courier"&gt;NOTE: The infile READ is:&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;Filename=unitslab_conversions.txt,&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;RECFM=V,LRECL=131068,File Size (bytes)=42931,&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;Last Modified=23Feb2020:10:51:35,&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;Create Time=21Feb2020:08:29:19&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT face="courier new,courier"&gt;INFO: FOUND EXPECTED HTML5 UTF-8 char: lbtest=alpha-1â€‘microglobulin&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;INFO: FOUND EXPECTED HTML5 UTF-8 char: lbtest=antiâ€&amp;#144;Mullerian hormone (AMH)&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;INFO: FOUND EXPECTED HTML5 UTF-8 char: lbtest=beta - CrossLaps - Degradation products of type I collagen&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;INFO: FOUND EXPECTED HTML5 UTF-8 char: lbtest=Beta 2â€&amp;#144;microglobulin (Î²2â€&amp;#144;M)&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;INFO: FOUND EXPECTED HTML5 UTF-8 char: lbtest=beta-Hydroxybutyric acid&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;INFO: FOUND EXPECTED HTML5 UTF-8 char: lbtest=CA 125 (Cancer Antigen 125)&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;INFO: FOUND EXPECTED HTML5 UTF-8 char: lbtest=CA 15-3 (Cancer Antigen 15-3)&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;INFO: FOUND EXPECTED HTML5 UTF-8 char: lbtest=CA 72-4 (AntigÃ¨ne de cancer 72-4)&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;INFO: FOUND EXPECTED HTML5 UTF-8 char: lbtest=CKâ€&amp;#144;MB mass - the MB isoenzyme of creatine kinase (quantitative determination)&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;INFO: 
FOUND EXPECTED HTML5 UTF-8 char: lbtest=Kappa (Îº) light chain&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;Note that prxmatch matches not only the Greek letters (small beta and kappa) but also the hyphens &lt;STRONG&gt;&lt;FONT face="courier new,courier"&gt;\x{2010}&lt;/FONT&gt;&lt;/STRONG&gt; and &lt;STRONG&gt;&lt;FONT face="courier new,courier"&gt;\x{2011}&lt;/FONT&gt;&lt;/STRONG&gt;. &lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;SAS has corrupted these UTF-8 chars into something in the Greek/Cyrillic range.&lt;/FONT&gt;&lt;/STRONG&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 24 Feb 2020 02:22:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/626746#M184885</guid>
      <dc:creator>GGO</dc:creator>
      <dc:date>2020-02-24T02:22:44Z</dc:date>
    </item>
    <item>
      <title>Re: encoding for proc http to sas7bdat, without corrupting characters</title>
      <link>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/626774#M184895</link>
      <description>&lt;P&gt;You are raising two different issues here:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1. Matching strings&lt;/P&gt;
&lt;P&gt;UTF-8 data needs functions rated &lt;EM&gt;I18N Level 2&lt;/EM&gt;. Not all SAS functions are. See here (it's a Viya link, but it holds for SAS too):&lt;/P&gt;
&lt;P&gt;&lt;A href="https://documentation.sas.com/?docsetId=nlsref&amp;amp;docsetTarget=p1pca7vwjjwucin178l8qddjn0gi.htm&amp;amp;docsetVersion=1.0&amp;amp;locale=en"&gt;https://documentation.sas.com/?docsetId=nlsref&amp;amp;docsetTarget=p1pca7vwjjwucin178l8qddjn0gi.htm&amp;amp;docsetVersion=1.0&amp;amp;locale=en&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
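&lt;P&gt;To make the difference concrete, here is a minimal sketch comparing a byte-oriented function with its K-function counterpart. The variable names are made up for illustration, and the hex literal is simply the UTF-8 byte pair for the Greek small beta (U+03B2).&lt;/P&gt;

```sas
data _null_;
  beta = 'CEB2'x;             /* UTF-8 bytes for Greek small beta (U+03B2) */

  n_bytes = length(beta);     /* LENGTH counts bytes */
  n_chars = klength(beta);    /* KLENGTH counts characters in the session encoding */

  put n_bytes= n_chars=;
run;
```

&lt;P&gt;In a UTF-8 session I would expect KLENGTH to report one character for those two bytes; in a single-byte session both numbers should be the same, which is why the session encoding matters for matching.&lt;/P&gt;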
&lt;P&gt;2. Displaying strings.&lt;/P&gt;
&lt;P&gt;You haven't said what encoding your SAS session uses.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;3. How do the contents of filerefs READ and WRITE compare in terms of encoded values?&lt;/P&gt;</description>
      <pubDate>Mon, 24 Feb 2020 03:05:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/626774#M184895</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-02-24T03:05:41Z</dc:date>
    </item>
    <item>
      <title>Re: encoding for proc http to sas7bdat, without corrupting characters</title>
      <link>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/626779#M184899</link>
      <description>&lt;P&gt;Thanks for that reference, Chris - I'll have to review (Internationalization).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I suspect that this will not explain why the approach in my code works for some UTF-8 range beyond ASCII, but not the entire HTML5 UTF-8 range.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Your point (3) is also a clever test - I'll take a look at this, as well.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I've set my session encoding at start-up: &lt;FONT face="courier new,courier"&gt;-encoding ASCIIANY&lt;/FONT&gt;.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I tried forcing (SBCS) sessions to &lt;A href="https://go.documentation.sas.com/?cdcId=pgmsascdc&amp;amp;cdcVersion=9.4_3.4&amp;amp;docsetId=nlsref&amp;amp;docsetTarget=n1i14wwq12o5z0n1tbzapn2jxwcn.htm&amp;amp;locale=en" target="_self"&gt;&lt;FONT face="courier new,courier"&gt;-encoding "UTF-8"&lt;/FONT&gt;, which according to documentation should be valid&lt;/A&gt;, although note that utf-8 is at best an after-thought on that page (same for UNIX). But SAS fails to launch with that setting ("invalid" encoding value).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I tried various combinations of DBCS, ENCODING, DBCSTYPE, DBCSLANG - all result in some sort of &lt;STRONG&gt;failure message, failure to start session,&lt;/STRONG&gt; so no further testing in this direction is possible in my work environment.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Much appreciated!&lt;/P&gt;</description>
      <pubDate>Mon, 24 Feb 2020 02:35:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/626779#M184899</guid>
      <dc:creator>GGO</dc:creator>
      <dc:date>2020-02-24T02:35:01Z</dc:date>
    </item>
    <item>
      <title>Re: encoding for proc http to sas7bdat, without corrupting characters</title>
      <link>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/626794#M184908</link>
      <description>&lt;P&gt;See here to start your session:&lt;/P&gt;
&lt;P&gt;&lt;A href="http://support.sas.com/kb/51/586.html" target="_blank"&gt;http://support.sas.com/kb/51/586.html&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
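&lt;P&gt;Once the session is up, it is worth confirming which encoding it actually runs with before testing anything else. A minimal sketch; the variable name enc is just for illustration.&lt;/P&gt;

```sas
/* Print the ENCODING system option of the running session to the log */
proc options option=encoding;
run;

/* The same value, retrieved inside a DATA step */
data _null_;
  length enc $40;
  enc = getoption('encoding');
  put 'Session encoding: ' enc;
run;
```

&lt;P&gt;If this does not report UTF-8, the session was probably not started from the Unicode configuration file.&lt;/P&gt;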
&lt;P&gt;Note that UTF-8 is not&amp;nbsp;&lt;SPAN&gt;DBCS (double-byte), it is&amp;nbsp;MBCS (multi-byte).&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 24 Feb 2020 03:11:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/626794#M184908</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-02-24T03:11:01Z</dc:date>
    </item>
    <item>
      <title>Re: encoding for proc http to sas7bdat, without corrupting characters</title>
      <link>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/626798#M184911</link>
      <description>&lt;P&gt;Again, Chris: Much appreciated, all of your nuggets of insight!&lt;/P&gt;</description>
      <pubDate>Mon, 24 Feb 2020 04:28:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/626798#M184911</guid>
      <dc:creator>GGO</dc:creator>
      <dc:date>2020-02-24T04:28:38Z</dc:date>
    </item>
    <item>
      <title>Re: encoding for proc http to sas7bdat, without corrupting characters</title>
      <link>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/626812#M184922</link>
      <description>&lt;P&gt;So all is working now?&lt;/P&gt;</description>
      <pubDate>Mon, 24 Feb 2020 08:02:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/626812#M184922</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-02-24T08:02:20Z</dc:date>
    </item>
    <item>
      <title>Re: encoding for proc http to sas7bdat, without corrupting characters</title>
      <link>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/626821#M184924</link>
      <description>&lt;P&gt;This discussion might interest you&lt;/P&gt;
&lt;P&gt;&lt;A href="https://www.pharmasug.org/proceedings/2018/BB/PharmaSUG-2018-BB08.pdf" target="_blank"&gt;https://www.pharmasug.org/proceedings/2018/BB/PharmaSUG-2018-BB08.pdf&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 24 Feb 2020 08:44:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/626821#M184924</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-02-24T08:44:40Z</dc:date>
    </item>
    <item>
      <title>Re: encoding for proc http to sas7bdat, without corrupting characters</title>
      <link>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/627296#M185140</link>
      <description>&lt;P&gt;Unfortunately I have not been able to fully resolve how SAS is handling HTML5 UTF-8 chars.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Step by step - this is what works, and this is what I cannot get past in a session with &lt;STRONG&gt;-ENCODING ASCIIANY&lt;/STRONG&gt;:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;1 - As Chris suggested, SAS is preserving HTML UTF-8 chars in the read/write process.&lt;/P&gt;&lt;P&gt;In the following snippet (a variation on the code above), the files "unitslab_conversions.txt" and "unitslab_conv2.txt" match exactly.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Note: TERMSTR=LF&lt;/STRONG&gt; overrides the default behaviour in my Win environment, to preserve UNIX line feeds (LF)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;%let w_encoding = UTF-8;
%let r_encoding = UTF-8;

filename WRITE  './unitslab_conversions.txt' encoding="&amp;amp;W_ENCODING";
filename  READ  './unitslab_conversions.txt' encoding="&amp;amp;R_ENCODING";
filename WRITE2 './unitslab_conv2.txt'       encoding="&amp;amp;W_ENCODING" TERMSTR=LF ;

libname data './data' inencoding="&amp;amp;R_ENCODING" outencoding="&amp;amp;W_ENCODING";

proc http
  method = 'GET'
     url = 'https://unitslab.com/'
     out = WRITE ;
run;

data _null_;
  infile READ length=len lrecl=32767;
  input;

  file WRITE2 lrecl=32767; 
  put _infile_;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;2 - However, within the string-searching / matching data step above, even the K-functions do not find the HTML5 UTF-8 chars in those text files. They DO find what seem to be corrupted chars. E.g., instead of the HTML UTF-8 HYPHEN char \x{2011}, the K-functions find the corrupted replacement string 'E28091'x, which matches the corrupted display chars "â€‘".&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;These searches find none of the expected characters:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;*--- https://www.w3schools.com/charsets/ref_html_utf8.asp 
     THESE DO NOT MATCH ;
  if kindex(line, '03B2'x)
     then putlog 'INFO: FOUND EXPECTED HTML5 UTF-8 BETA char: ' lbtest=;
  if kindex(line, '03BA'x)
     then putlog 'INFO: FOUND EXPECTED HTML5 UTF-8 KAPPA char: ' lbtest=;
  if kindex(line, '2010'x)
     then putlog 'INFO: FOUND EXPECTED HTML5 UTF-8 HYPHEN char: ' lbtest=;
  if kindex(line, '2011'x)
     then putlog 'INFO: FOUND EXPECTED HTML5 UTF-8 NON-BREAKING HYPHEN char: ' lbtest=;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;These searches do find UNEXPECTED transcoded/corrupted chars:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;*--- https://www.w3schools.com/charsets/ref_html_utf8.asp 
     THESE DO MATCH, BUT SHOULD NOT - TRANSCODED CHARS, NOT THE ORIGINAL CHARS ;
  if kindex(line, 'E2'x)
     then putlog 'INFO: FOUND UNEXPECTED HTML5 UTF-8 â char: ' lbtest=;
  if kindex(line, '80'x)
     then putlog 'INFO: FOUND UNEXPECTED HTML5 UTF-8 € char: ' lbtest=;
  if kindex(line, '91'x)
     then putlog 'INFO: FOUND UNEXPECTED HTML5 UTF-8 ‘ char: ' lbtest=;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;I am unable to resolve or make sense of this. It would be quite nice if this were somehow more transparent.&lt;/P&gt;</description>
      <pubDate>Tue, 25 Feb 2020 19:21:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/627296#M185140</guid>
      <dc:creator>GGO</dc:creator>
      <dc:date>2020-02-25T19:21:50Z</dc:date>
    </item>
    <item>
      <title>Re: encoding for proc http to sas7bdat, without corrupting characters</title>
      <link>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/627299#M185142</link>
      <description>&lt;P&gt;And just to add that pointing to the alternative UTF-8 config file (-ENCODING UTF-8) does not help.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;That was the guidance here: &lt;A href="http://support.sas.com/kb/51/586.html" target="_blank"&gt;http://support.sas.com/kb/51/586.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 25 Feb 2020 19:35:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/627299#M185142</guid>
      <dc:creator>GGO</dc:creator>
      <dc:date>2020-02-25T19:35:45Z</dc:date>
    </item>
    <item>
      <title>Re: encoding for proc http to sas7bdat, without corrupting characters</title>
      <link>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/633838#M188058</link>
      <description>&lt;P&gt;Well, I'm stubborn, and I don't like to give up, so in case it helps anyone else, here is my summary of successfully working with UTF-8 characters.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Much of what ChrisNZ suggested was very helpful, and contributed to finally sorting this out.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Steps to check when working with UTF-8 in &lt;STRONG&gt;SAS 9.4 on Windows&lt;/STRONG&gt; Server 2016 / Win10:&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;Make sure your SAS session is set for UTF-8 encoding. This should be set in your session config, or in the command-line string (shortcut target) via the -CONFIG system option.&lt;/LI&gt;&lt;LI&gt;The correct config file is typically: -CONFIG "C:\Program Files\SASHome\SASFoundation\9.4\nls\u8\sasv9.cfg"&lt;/LI&gt;&lt;LI&gt;SAS 9.4 installs an application menu shortcut "SAS 9.4 (Unicode Support)" that should point you to that config file.&lt;/LI&gt;&lt;LI&gt;Check that the config file sets: &lt;STRONG&gt;-ENCODING UTF-8&lt;/STRONG&gt;&lt;/LI&gt;&lt;LI&gt;Get to know that setting well, since you can force the read/write encoding on libname and filename statements to prevent unwanted transcoding of UTF-8 chars&lt;/LI&gt;&lt;LI&gt;&lt;A href="https://go.documentation.sas.com/?cdcId=pgmsascdc&amp;amp;cdcVersion=9.4_3.4&amp;amp;docsetId=lestmtsglobal&amp;amp;docsetTarget=n1nk65k2vsfmxfn1wu17fntzszbp.htm&amp;amp;locale=en" target="_self"&gt;LIBNAME options&lt;/A&gt;: SAS 9.4 provides both INENCODING and OUTENCODING settings&lt;/LI&gt;&lt;LI&gt;&lt;A href="https://go.documentation.sas.com/?cdcId=pgmsascdc&amp;amp;cdcVersion=9.4_3.4&amp;amp;docsetId=lestmtsglobal&amp;amp;docsetTarget=p05r9vhhqbhfzun1qo9mw64s4700.htm&amp;amp;locale=en#n094h314v2s8hvn1cohy8heflbt9" target="_self"&gt;FILENAME options&lt;/A&gt;: read the SAS 9.4 documentation, which covers the ENCODING setting mentioned above.&lt;/LI&gt;&lt;LI&gt;Get to know your encoding options - &lt;A 
href="https://go.documentation.sas.com/?cdcId=pgmsascdc&amp;amp;cdcVersion=9.4_3.4&amp;amp;docsetId=nlsref&amp;amp;docsetTarget=n1r7pnb91iybs9n1hgvsj7q09srd.htm&amp;amp;locale=en" target="_self"&gt;SAS 9.4 Encoding Values in SAS Language Elements&lt;/A&gt;&lt;/LI&gt;&lt;LI&gt;Get to know SBCS, DBCS, MBCS and the &lt;A href="https://go.documentation.sas.com/?cdcId=pgmsascdc&amp;amp;cdcVersion=9.4_3.4&amp;amp;docsetId=nlsref&amp;amp;docsetTarget=p1pca7vwjjwucin178l8qddjn0gi.htm&amp;amp;locale=en" target="_self"&gt;SAS 9.4 Internationalization Compatibility for SAS String Functions&lt;/A&gt;&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Note: prxmatch&lt;/STRONG&gt; functions (which I started with at top) only support SBCS. So find another approach when working with DBCS, MBCS&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Note:&lt;/STRONG&gt; But also note that if you are certain that you are working with a section of the source that does not contain any DB or MB chars, then you may just get away with using the SB-only functions &lt;span class="lia-unicode-emoji" title=":crossed_fingers:"&gt;🤞&lt;/span&gt; &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Note:&lt;/STRONG&gt; some K-functions are buggy (Sorry, SAS - I'll back that up with example code, below)&lt;/LI&gt;&lt;LI&gt;That Internationalization page gives examples of how to search for hex literals in a UTF-8 session - syntax like "&amp;lt;hex-code&amp;gt;"&lt;STRONG&gt;x&lt;/STRONG&gt;&lt;/LI&gt;&lt;LI&gt;You'll have to understand &lt;A href="https://documentation.sas.com/?docsetId=lrcon&amp;amp;docsetTarget=p0cq7f0icfjr8vn19vyunwmmsl7m.htm&amp;amp;docsetVersion=9.4&amp;amp;locale=en#n1l9fhsrv13jbtn1txnuvj61aefd" target="_self"&gt;Character Constants Expressed in Hexadecimal Notation&lt;/A&gt;&amp;nbsp;&lt;/LI&gt;&lt;LI&gt;The final piece is actually knowing the Hex Code for characters of interest. 
The best reference that I can find is: &lt;A href="https://www.utf8-chartable.de/unicode-utf8-table.pl?start=0&amp;amp;number=1024" target="_blank"&gt;https://www.utf8-chartable.de/unicode-utf8-table.pl?start=0&amp;amp;number=1024&lt;/A&gt; ... but it is hard to search, since there are lots of UTF-8 chars &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;Good luck &lt;span class="lia-unicode-emoji" title=":winking_face:"&gt;😉&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 21 Mar 2020 18:50:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/633838#M188058</guid>
      <dc:creator>GGO</dc:creator>
      <dc:date>2020-03-21T18:50:06Z</dc:date>
    </item>
    <item>
      <title>Re: encoding for proc http to sas7bdat, without corrupting characters</title>
      <link>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/633851#M188066</link>
      <description>&lt;P&gt;As promised above, here is an example of buggy, or at least unreliable, K-functions.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;A href="https://go.documentation.sas.com/?cdcId=pgmsascdc&amp;amp;cdcVersion=9.4_3.4&amp;amp;docsetId=nlsref&amp;amp;docsetTarget=n01wgwo05gbv68n1w1u68i89pbtw.htm&amp;amp;locale=en" target="_self"&gt;KCOMPRESS()&lt;/A&gt; should have an interface similar to that of &lt;A href="https://go.documentation.sas.com/?cdcId=pgmsascdc&amp;amp;cdcVersion=9.4_3.4&amp;amp;docsetId=nlsref&amp;amp;docsetTarget=n0fcshr0ir3h73n1b845c4aq58hz.htm&amp;amp;locale=en" target="_self"&gt;COMPRESS()&lt;/A&gt;, including modifiers.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Unfortunately, it does not actually accept modifiers:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
  length str noletters nonumbers $50;
  do str = 'label123', 'α1β2γ3δ', 'ceb131ceb232ceb333ceb434'x;
    putlog 'ORIGINAL: '     str= str=$hex24. ;
    nonumbers = strip(kcompress(str, ,'d'));
    putlog ' No numbers: ' nonumbers= str=$hex24. ;
    noletters = strip(kcompress(str, ,'dk'));
    putlog ' No letters: ' noletters= str=$hex24. ;
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;This throws errors:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;1      nonumbers = strip(kcompress(str, ,'d'));
&lt;FONT color="#FF0000"&gt;                         ---------
                         72&lt;/FONT&gt;
&lt;FONT color="#FF0000"&gt;ERROR 72-185: The KCOMPRESS function call has too many arguments.&lt;/FONT&gt;&lt;/PRE&gt;&lt;P&gt;Brute force still works:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
  length str noletters nonumbers $50;
  do str = 'label123', 'α1β2γ3δ', 'ceb131ceb232ceb333ceb434'x;
    putlog 'ORIGINAL: '     str= str=$hex24. ;
    nonumbers = strip(kcompress(str, '0123456789'));
    putlog ' No numbers: ' nonumbers= str=$hex24. ;
    noletters = strip(kcompress(str, 'abcdefghijklmnopqrstuvwxyz'));
    putlog ' No letters: ' noletters= str=$hex24. ;
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;However, even in a UTF-8 session, &lt;STRONG&gt;the log cannot display UTF-8 characters,&lt;/STRONG&gt; which is rather unhelpful.&lt;/P&gt;&lt;PRE&gt;ORIGINAL: str=label123 str=6C6162656C31323320202020
 No numbers: nonumbers=label str=6C6162656C31323320202020
 No letters: noletters=123 str=6C6162656C31323320202020
ORIGINAL: str=Î±1Î²2Î³3Î´ str=CEB131CEB232CEB333CEB420
 No numbers: nonumbers=Î±Î²Î³Î´ str=CEB131CEB232CEB333CEB420
 No letters: noletters=Î±1Î²2Î³3Î´ str=CEB131CEB232CEB333CEB420
ORIGINAL: str=Î±1Î²2Î³3Î´4 str=CEB131CEB232CEB333CEB434
 No numbers: nonumbers=Î±Î²Î³Î´ str=CEB131CEB232CEB333CEB434
 No letters: noletters=Î±1Î²2Î³3Î´4 str=CEB131CEB232CEB333CEB434&lt;/PRE&gt;</description>
      <pubDate>Sat, 21 Mar 2020 19:15:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/633851#M188066</guid>
      <dc:creator>GGO</dc:creator>
      <dc:date>2020-03-21T19:15:42Z</dc:date>
    </item>
    <item>
      <title>Re: encoding for proc http to sas7bdat, without corrupting characters LANGUAGECONTROL</title>
      <link>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/633885#M188089</link>
      <description>&lt;P&gt;Excellent summary.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;About:&amp;nbsp;&amp;nbsp;&amp;nbsp; 4. Check that the config file sets: &lt;STRONG&gt;-ENCODING UTF-8&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;You can check this by running&lt;FONT face="courier new,courier"&gt; proc options&amp;nbsp;group=languagecontrol; run;&lt;/FONT&gt; and looking at the ENCODING value in the log.&lt;/P&gt;</description>
      <pubDate>Sat, 21 Mar 2020 22:07:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/633885#M188089</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-03-21T22:07:01Z</dc:date>
    </item>
    <item>
      <title>Re: encoding for proc http to sas7bdat, without corrupting characters</title>
      <link>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/633889#M188092</link>
      <description>&lt;P&gt;Nothing buggy or unreliable about the K-functions.&lt;/P&gt;
&lt;P&gt;They support fewer features, that's all.&lt;/P&gt;
&lt;P&gt;And it's easy to understand why.&lt;/P&gt;
&lt;P&gt;For example what's a number or a letter in UTF-8?&lt;/P&gt;
&lt;P&gt;Is δ a letter? And ﻌ? What about consonant clusters used in Korean, such as &lt;STRONG&gt;ㄵ&lt;/STRONG&gt;?&lt;/P&gt;
&lt;P&gt;Is ٣ a number? Or ፵?&lt;/P&gt;
&lt;P&gt;What about Hebrew numbers? Fifteen can be ט״ו or י״ה, although the second of these 3 characters is not a number.&lt;/P&gt;
&lt;P&gt;It gets really complicated really quickly when you want to support all the writing systems in the world.&lt;/P&gt;
&lt;P&gt;This explains why, for now, some options are left out of the multi-byte functions.&lt;/P&gt;
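&lt;P&gt;A minimal sketch of the workaround, assuming a UTF-8 session: since no modifier decides what counts as a digit, the characters to remove are spelled out explicitly, here the ASCII digits plus the Arabic-Indic digit three (UTF-8 bytes D9 A3). The variable names and the choice of extra digit are only illustrative.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
  length str nodigits $50;
  /* Assumption: session encoding is UTF-8, so 'd9a3'x is a single character */
  /* alpha 1 beta 2 gamma, followed by the Arabic-Indic digit three */
  str = 'ceb131ceb232ceb3d9a3'x;
  /* Two-argument KCOMPRESS: remove every character in the explicit list */
  nodigits = kcompress(str, '0123456789' || 'd9a3'x);
  /* The hex dump shows the remaining bytes however the log renders them */
  putlog nodigits= nodigits=$hex12.;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The point of the explicit list is that you, not the function, decide which characters are digits for your data.&lt;/P&gt;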
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 21 Mar 2020 22:51:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/633889#M188092</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-03-21T22:51:33Z</dc:date>
    </item>
    <item>
      <title>Re: encoding for proc http to sas7bdat, without corrupting characters</title>
      <link>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/633891#M188093</link>
      <description>&lt;P&gt;I suspect that your corrupted log has more to do with your system than with SAS.&lt;/P&gt;
&lt;P&gt;My Linux session has no issue at all.&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="kk.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/37177iE1FB65B3B18F35C0/image-size/medium?v=v2&amp;amp;px=400" role="button" title="kk.png" alt="kk.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 21 Mar 2020 22:52:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/633891#M188093</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-03-21T22:52:53Z</dc:date>
    </item>
    <item>
      <title>Re: encoding for proc http to sas7bdat, without corrupting characters</title>
      <link>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/633894#M188095</link>
      <description>That log looks like something submitted via Enterprise Guide or SAS/Studio rather than from Display Manager.  Notice how the line numbers skip from 1 to 72.</description>
      <pubDate>Sat, 21 Mar 2020 22:42:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/633894#M188095</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2020-03-21T22:42:31Z</dc:date>
    </item>
    <item>
      <title>Re: encoding for proc http to sas7bdat, without corrupting characters</title>
      <link>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/633896#M188097</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/159"&gt;@Tom&lt;/a&gt; My log is SAS/Studio on AWS.&lt;/P&gt;
&lt;P&gt;I don't think this matters, does it?&lt;/P&gt;
&lt;P&gt;Unless the SAS version for Unicode does not support Unicode (which would be ironic)?&lt;/P&gt;
&lt;P&gt;My assumption is that the OS is at fault here. I could be wrong though.&lt;/P&gt;
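&lt;P&gt;A rough way to narrow this down, assuming a SAS 9.4 session: print the session encoding and dump the bytes of a test string. If the hex is intact, the data is fine and only the log rendering is off. The test string is just an example.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Show the encoding the SAS session is actually running in */
%put Session encoding: &amp;amp;sysencoding;
proc options option=encoding; run;

data _null_;
  length str $12;
  /* alpha 1 beta 2 given as UTF-8 bytes, so the editor cannot alter the text */
  str = 'ceb131ceb232'x;
  /* The hex dump shows the stored bytes however the log renders them */
  putlog str= str=$hex12.;
run;&lt;/CODE&gt;&lt;/PRE&gt;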
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 21 Mar 2020 22:47:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/633896#M188097</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-03-21T22:47:27Z</dc:date>
    </item>
    <item>
      <title>Re: encoding for proc http to sas7bdat, without corrupting characters</title>
      <link>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/633897#M188098</link>
      <description>DMS is old and not being updated.&lt;BR /&gt;</description>
      <pubDate>Sat, 21 Mar 2020 22:55:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/633897#M188098</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2020-03-21T22:55:05Z</dc:date>
    </item>
    <item>
      <title>Re: encoding for proc http to sas7bdat, without corrupting characters</title>
      <link>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/633898#M188099</link>
      <description>&lt;P&gt;The program editor shows the characters correctly.&lt;/P&gt;
&lt;P&gt;Could it just be the font used for the log then?&lt;/P&gt;</description>
      <pubDate>Sat, 21 Mar 2020 23:10:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/encoding-for-proc-http-to-sas7bdat-without-corrupting-characters/m-p/633898#M188099</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-03-21T23:10:14Z</dc:date>
    </item>
  </channel>
</rss>

