Transcoding problem

Contributor
Posts: 73

I have a CSV file with some non-English characters in it. When I read it with a DATA step and INFILE in regular SAS, the system stops reading as soon as it encounters a foreign character and ignores the remainder of the file, so a large chunk of the data is not imported. I then ran the same code under SAS 9.4 with Unicode support, which seems to read in the correct number of rows. The problem comes when I try to save the result as a dataset in a pre-defined library:

 

"ERROR: Some character data was lost during transcoding in the dataset lib1.dat1. Either the data contains characters that are not representable in the new encoding or truncation occurred during transcoding."

and 

"WARNING: The data set lib1.data1 may be incomplete. When this step was stopped there were x observations and y variables." x is way less than the number of records in the dataset in the work library.

 

Is this because the dataset is currently in Unicode and needs to be converted to a regular encoding somehow before it can be saved and used? Thanks.


Accepted Solutions
Solution
‎08-23-2017 11:52 AM
Super User
Posts: 3,306

Re: Transcoding problem

Posted in reply to apolitical

I had a similar encoding problem recently and solved it this way using the OUTENCODING option:

 

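libname inlib 'MyInputLibrary';
/* OUTENCODING=ASCIIANY: no transcoding when writing between ASCII-based encodings */
libname outlib 'MyOutputLibrary' outencoding=asciiany;
/* NOCLONE: do not carry over the input dataset's attributes, such as its encoding */
proc copy noclone in=inlib out=outlib;
   select MyDataset;
run;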



All Replies
PROC Star
Posts: 331

Re: Transcoding problem

Posted in reply to apolitical

This is an encoding issue. The encoding setting in your SAS config may have defaulted to latin1. We had a problem like this before and tried to fix it by changing a session option, but that did not work. What was suggested to us at the time was to change the encoding setting to UTF8 in the SAS config file.

Run

proc options option=encoding;
run;

I think you will see latin1 as your encoding.

 

There might be another solution, but I am just sharing my experience.
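
As a rough sketch of how the check above could be extended, the following also prints the session encoding to the log and shows the encoding recorded on a dataset. The dataset name work.data1 is an assumption taken from the question, not something confirmed in this thread.

```sas
/* Write the session encoding to the log; SYSENCODING is an automatic macro variable */
%put NOTE: Session encoding is &sysencoding;

/* The attribute listing from PROC CONTENTS includes an "Encoding" row */
/* work.data1 is assumed here; substitute your own dataset name */
proc contents data=work.data1;
run;
```

If the session encoding and the dataset encoding differ, SAS has to transcode when it reads or writes that dataset, which is where the data loss in the error message can occur.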

Contributor
Posts: 73

Re: Transcoding problem

Thanks. Actually I see "ENCODING=UTF-8", I suppose because I'm running SAS with Unicode support. My guess is that the incompatibility arises when I try to save a Unicode dataset into a dataset with a different encoding, although I don't know why SAS would automatically save it in another encoding without asking. Maybe that's the default.

Contributor
Posts: 73

Re: Transcoding problem

Thanks everyone for the advice. I also found that adding the ENCODING= dataset option in the data step can work:
data lib1.data1 (encoding=any);
   set data1;
run;
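
Since the original warning reported fewer observations than expected, it may be worth confirming that the saved copy is complete. A minimal sketch, assuming the libref LIB1 and the dataset name DATA1 used in this thread:

```sas
/* Compare observation counts of the WORK copy and the saved copy */
/* Librefs and member names are stored in uppercase in DICTIONARY.TABLES */
proc sql;
   select libname, memname, nobs
   from dictionary.tables
   where libname in ('WORK', 'LIB1')
     and memname = 'DATA1';
quit;
```

Matching counts only show that no rows were dropped. Because ENCODING=ANY suppresses transcoding, the stored characters are not validated or converted, so the text itself should still be spot-checked.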
PROC Star
Posts: 331

Re: Transcoding problem

Posted in reply to apolitical

Please check the link below:

 

http://support.sas.com/kb/41/925.html

 

☑ This topic is solved.
