Database Encoding

Contributor
Posts: 29

When I use PROC APPEND, it creates the table in UTF-8. In my config the encoding is set to wlatin1. What is causing PROC APPEND to create tables with this encoding? It's a major problem, because I then get "ERROR: File <filename> cannot be updated because its encoding does not match the session encoding or the file" when I append from WORK (which is in wlatin1).

James

Super User
Posts: 11,343

Re: Database Encoding

Is the BASE data set encoded as UTF-8 BEFORE the append? If you received the data set from someone else that is a likely culprit.

Contributor
Posts: 29

Re: Database Encoding

The base data set is WLatin1; I did confirm this encoding. The data set was pulled from SQL Server using explicit SQL pass-through.

PROC Star
Posts: 1,167

Re: Database Encoding

Not sure how to figure out the encoding of a SAS/Access connection...maybe pull it into a WORK dataset, and check that?
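
A minimal sketch of that idea, assuming an ODBC connection to the SQL Server database. The data source name "mydsn", the alias src, and the table dbo.some_table are placeholders, not names from this thread. It pulls the rows into WORK with explicit pass-through and then prints the data set attributes, which include the encoding:

```sas
proc sql;
  /* "mydsn" is a placeholder ODBC data source name (assumption) */
  connect to odbc as src (datasrc="mydsn");

  /* Explicit pass-through: the inner query runs on the DBMS side */
  create table work.pulled as
    select * from connection to src
      (select * from dbo.some_table);  /* placeholder table name */

  disconnect from src;
quit;

/* The Encoding row of the output shows how work.pulled was stored */
proc contents data=work.pulled;
run;
```

A data set created this way would normally be written in the session encoding, so the result mostly tells you what the session was using at the time of the pull.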

Tom

Contributor
Posts: 29

Re: Database Encoding

Oddly, when I open SAS, PROC OPTIONS says the encoding is WLatin1. But after I run the program, it says the encoding is UTF-8. Very odd, since I didn't think you could change the ENCODING option except at SAS startup. Also, no encoding is specified in the program.

Super User
Posts: 19,792

Re: Database Encoding

Are you using rsubmit? Are you running both on a server and locally?

PROC Star
Posts: 1,167

Re: Database Encoding

If you run

PROC DATASETS;
   CONTENTS DATA=dsn;
RUN;
QUIT;

on all of the datasets participating in your append, it will display the encoding of each one.

Also, you can use PROC OPTIONS OPTION=ENCODING to see what your default is.
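
If you would rather see both values in the log than read through PROC CONTENTS output, something along these lines may help. It is only a sketch: the macro name show_encoding is made up for illustration, and it relies on the GETOPTION function and the ENCODING attribute of the ATTRC function.

```sas
/* Session encoding, as reported by the ENCODING system option */
%put NOTE: Session encoding is %sysfunc(getoption(encoding));

/* Hypothetical helper: write the encoding attribute of one data set to the log */
%macro show_encoding(ds);
  %local dsid rc;
  %let dsid = %sysfunc(open(&ds));
  %if &dsid %then %do;
    %put NOTE: &ds is encoded as %sysfunc(attrc(&dsid,encoding));
    %let rc = %sysfunc(close(&dsid));
  %end;
  %else %put WARNING: Could not open &ds..;
%mend show_encoding;

/* Call it once for the BASE= data set and once for the DATA= data set */
%show_encoding(sashelp.class)
```

Running the %put line at the start and at the end of the program would also show whether the session encoding really differs between the two points.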

Super User
Posts: 19,792

Re: Database Encoding

SAS does appear to use the encoding of the base data set in PROC APPEND. Verify the encodings with PROC CONTENTS before and after the PROC APPEND.

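/* Encoding of the source data set */
proc contents data=sashelp.class;
run;

/* Create a copy with an explicit encoding */
data class (encoding=wlatin1);
  set sashelp.class;
run;

proc contents data=class;
run;

/* Append, then check that the base encoding is unchanged */
proc append base=class data=sashelp.class;
run;

proc contents data=class;
run;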
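
If the base data set does turn out to be in a different encoding than the session, one possible workaround is to copy it into a new data set in the session encoding and append to that copy instead. This is a sketch under assumptions: mylib.base and work.new_rows are hypothetical names, and wlatin1 is assumed to be the session encoding. Characters that have no wlatin1 equivalent may not survive the transcoding.

```sas
/* mylib.base is a hypothetical data set stored in another encoding, e.g. UTF-8 */
/* Step 1: copy it, forcing the encoding the session uses (assumed wlatin1)     */
data work.base_fixed (encoding=wlatin1);
  set mylib.base;
run;

/* Step 2: append to the copy, which now matches the session encoding */
/* work.new_rows is a hypothetical data set holding the rows to add   */
proc append base=work.base_fixed data=work.new_rows;
run;

/* Step 3: confirm the encoding of the result */
proc contents data=work.base_fixed;
run;
```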

Trusted Advisor
Posts: 3,212

Re: Database Encoding

Some things to know:
- Every SAS session has a run-time encoding. This is the one you see with PROC OPTIONS.
- Every SAS data set has its own encoding, stored as an attribute of the data set (this changed with SAS 9).
- Every DBMS has its own encoding. With SAS/ACCESS, the translation settings have to be configured.
- Enterprise Guide uses UTF-8 by default. UTF-8 is also common in MS Office, HTML (web), etc.
  Funnily enough, most people are not aware of the differences between these encodings and still work in the single-byte (Latin) approach of the 1980s.
- SAS sessions that support UTF-8 need to be started with a dedicated start script. They are difficult to combine with Latin-1 data, although it is possible.
- UTF-8 and Latin-1 are interchangeable only for the characters that both store as the same single byte (the 7-bit ASCII range). Other Latin-1 characters, such as accented letters, take more than one byte in UTF-8.

PROC APPEND does not normally create a data set; it combines two existing ones (it creates the BASE data set only when that data set does not exist yet).

Questions:
- Are the BASE and DATA data sets for PROC APPEND both SAS data sets?
  - If they are, do they have the same encoding?
- Are you appending to an external DBMS?
  - What is the encoding of the DBMS, and what are the settings of the DBMS client connected to SAS/ACCESS?
    The DBMS client is what passes the encoding to SAS/ACCESS; the RDBMS itself may use another encoding.

These topics are rather cumbersome. For the programming part there is an NLS guide: SAS(R) 9.4 National Language Support (NLS): Reference Guide, Second Edition

---->-- ja karman --<-----