BookmarkSubscribeRSS Feed
J_Muth
Calcite | Level 5

Hello,

Does somebody know, how I can set the encoding for reading a table on a MS SQL server via OLE DB?

The table in MSSQL has a wlatin1 encoding.

 

When SAS 9.4 is started with the English locale, it reads the table correctly. Unfortunately there are also other resources that need SAS in UNICODE mode. But that doesn't read the special characters correctly.

 

libname mydblib oledb
init_string="Provider=SQLOLEDB.1;Integrated Security=SSPI;Persist Security Info=True;Initial Catalog=&dbase;Data Source=&server" schema=dbo;

proc sql;
create table test as
SELECT * FROM mydblib.test;
quit;

 

I tried outencoding = "wlatin1" on the libname statement and encoding = "wlatin1" on the SELECT, but got 

WARNING 278-63: The option OUTENCODING is not implemented in the OLEDB engine.

and 

WARNING: The option ENCODING is not implemented in the OLEDB engine.

 

Is there any possibility to read that wlatin1 table correctly in SAS 9.4 Unicode?

 

Thank you for hints!

15 REPLIES 15
ChrisNZ
Tourmaline | Level 20

Have you tried this

proc sql;
create table test as
SELECT * FROM mydblib.test(encoding='wlatin1');
quit;

 

J_Muth
Calcite | Level 5

Dear Chris!

 

As you can see in the description of my problem: Yes, I did.

And the error message was:

WARNING: The option ENCODING is not implemented in the OLEDB engine.

 

BR,

Juergen

ChrisNZ
Tourmaline | Level 20

Can you show the full log please?

J_Muth
Calcite | Level 5

Thanks for checking. Here is the log...

ChrisNZ
Tourmaline | Level 20

Just like your PDF, this file is off limits.

Do not attach files. Paste everything in line using the appropriate icons.

 

ChrisNZ_1-1615501044911.png

ChrisNZ_2-1615501057779.png

 

J_Muth
Calcite | Level 5
NOTE: Copyright (c) 2016 by SAS Institute Inc., Cary, NC, USA.
NOTE: SAS (r) Proprietary Software 9.4 (TS1M6 MBCS3170)
      Licensed to XXXXX.
NOTE: This session is executing on the X64_10PRO  platform.



NOTE: Analytical products:

      SAS/STAT 15.1
      SAS/ETS 15.1
      SAS/IML 15.1

NOTE: Additional host information:

 X64_10PRO WIN 10.0.18362  Workstation

NOTE: SAS initialization used:
      real time           1.10 seconds
      cpu time            0.92 seconds

NOTE: AUTOEXEC processing beginning; file is XXXXX.sas.

NOTE: Libref XXXXX was successfully assigned as follows:
      Engine:        V9
      Physical Name: XXXXX
NOTE: Libref XXXXX was successfully assigned as follows:
      Engine:        V9
      Physical Name: XXXXX

NOTE: AUTOEXEC processing completed.

1    %let server=XXXXX;
2    %let dbase=XXXXX;
3
4    libname mydblib oledb
5    init_string="Provider=SQLOLEDB.1;Integrated Security=SSPI;Persist Security Info=True;
5  ! Initial Catalog=&dbase;Data Source=&server" outencoding="wlatin1" schema=dbo;
                                                             ---------
                                                             278
NOTE: Libref MYDBLIB was successfully assigned as follows:
      Engine:        OLEDB
      Physical Name:
WARNING 278-63: The option OUTENCODING is not implemented in the OLEDB engine.

6
7    libname mydblib oledb
8    init_string="Provider=SQLOLEDB.1;Integrated Security=SSPI;Persist Security Info=True;
8  ! Initial Catalog=&dbase;Data Source=&server" schema=dbo;
NOTE: Libref MYDBLIB was successfully assigned as follows:
      Engine:        OLEDB
      Physical Name:

9    proc sql;
10   create table test as
11   SELECT  *
12     FROM mydblib.test(encoding='wlatin1');
WARNING: The option ENCODING is not implemented in the OLEDB engine.
13   quit;
NOTE: PROCEDURE SQL used (Total process time):
      real time           0.20 seconds
      cpu time            0.00 seconds

ChrisNZ
Tourmaline | Level 20

I see. This is less than ideal. Ideas:

-  Look if there are parameters that can be configured with the OLEDB provider. This might involve getting a more modern version. MS has left many dead bodies in the wake of the OLEDB ship. See here. https://docs.microsoft.com/en-us/sql/connect/oledb/oledb-driver-for-sql-server?view=sql-server-ver15

- Use ODBC. Unsure if encoding options are better supported.

- Use the native SAS/Access product. See here https://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.5&docsetId=acreldb&docsetTarget=p037... Again, unsure if encoding options are better supported.

- Ask the SQL Server admins to copy that table to a UTF-8 database. UTF-8 is the future anyway, so they might as well start the migration.

None of this is ideal, but then you seem to be stuck.

 

J_Muth
Calcite | Level 5
Thank you for the hints. I don't think that this is a problem on the Microsoft side. As I wrote above:

When I start SAS with the English configuration script, it works correctly. But because of the connection to UTF8 encoded sources I want to start it with the u8 configuration.


ODBC is not a solution because of the SingleSignOn approach used in our MSSQL.


And I am not allowed to change the table in the DB.


So the only way to go is to start SAS (Englisch), get the table from the db. Start SAS (Unicode) and proceed...
ChrisNZ
Tourmaline | Level 20

> I don't think that this is a problem on the Microsoft side.

Not saying there is a problem on the MS side, just that there might be useful parameters that you could set.

 

Also, did you see on the MS page whose link I sent that your connection method seems to be unsupported?

If my ideas are not workable, then I see no option other than importing with SAS UTF-8 and reprocessing with SAS wlatin, as you suggest.

Tom
Super User Tom
Super User

@J_Muth wrote:

Hello,

Does somebody know, how I can set the encoding for reading a table on a MS SQL server via OLE DB?

The table in MSSQL has a wlatin1 encoding.

 

When SAS 9.4 is started with the English locale, it reads the table correctly. Unfortunately there are also other resources that need SAS in UNICODE mode. But that doesn't read the special characters correctly.

...


What does "doesn't read the special characters correctly" mean?  Does it translate the non-7bit ascii codes in the database from their representation in the single byte WLATIN1 encoding into the multi-byte UTF-8 encoding?  Did  you remember to include extra length in the character variables to allow the possibility that the new UTF-8 strings will be longer?

J_Muth
Calcite | Level 5
Dear Tom!



Thank you for your reply.

(Sorry, it is not possible to attach a screenshots. I hope, the characters will show correctly in this forum...)



There are 2 problems:

1) Reading data that is already in the DB

2) Writing updates into the DB



For Simulation:
data testline;
length text 20;
* E4 60 B4 F1;
text = "ä ` ´ ñ";
run;

data mydblib.test;
set testline;
run;



Ad 1) With
proc sql noprint;
select text into :check from mydblib.test;
quit;
%put ✓



executed in SAS (English) I get as wanted:

ä ` ´ ñ



The SQL run in SAS (U8) on the same data (not created with SAS (U8)) gives something that I can’t copy paste from the log.

But it looks like:

• ` • • in the log and

� ` � when displaying the table (Yes, there are only two ?, but length is 20!)

An export to a csv gives correct characters.

------------------



Ad 2)

On the other hand, when I append data with SAS (U8),
proc append data=testline base=mydblib.test;

run;

SAS (English) reads:

ä ` ´ ñ from the second line (in the log, when displaying the table and exporting to CSV)



----------------------

So as long as there is no encoding-Option on just that connection I will have to use the workaround of starting two SAS sessions, one for the communication with the MSSQL server (SAS(English)), another for the rest of the system (SAS(U8)). 😞


Tom
Super User Tom
Super User

So does the database contain the WLATIN1 encoding of 'E460B4F1'x for those characters? 

Or the UTF-8 encoding of 'C3A460C2B4C3B'?
Note that the back-tick uses the same '60'x encoding in both.

 

Are you literally placings those non-7bit ascii codes into the text of your SAS program?  Or are you reading them from the files?

J_Muth
Calcite | Level 5
The database contains texts in wlatin1, as 'E460B4F1'x, placed there by another legacy application.


The literals in the snippet were only for demonstration purposes. Those texts are normally imported from other systems (file interface, e.g. csv, xlsx) and I can handle the encoding with options.
Tom
Super User Tom
Super User

So if you read an 'E4'x byte from a WLATIN1 source (your OLEDB database) into a UTF-8 SAS session you would expect the result to be the bytes 'C3A4'x.  Is that what is happening?

And what happens if you try the reverse?  THat is push the hex code 'C3A4'x into the database?  Does it end up as 'E4'x?

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 15 replies
  • 2603 views
  • 0 likes
  • 3 in conversation