apolitical
Obsidian | Level 7

I have a CSV file that contains some non-English characters. When I read it with a DATA step INFILE statement in regular SAS, the system stops reading as soon as it encounters a foreign character and ignores the remainder of the file, so a large chunk of the data is not imported. I then ran the same code under SAS 9.4 with Unicode support, which seems to read in the correct number of rows. The problem comes when I try to save the result as a dataset in a pre-defined library:

 

"ERROR: Some character data was lost during transcoding in the dataset lib1.dat1. Either the data
contains characters that are not representable in the new encoding or truncation occurred during transcoding."

and 

"WARNING: The data set lib1.data1 may be incomplete. When this step was stopped there were x observations and y variables." Here x is far smaller than the number of records in the dataset in the WORK library.

 

Is this because the dataset is currently in Unicode and needs to be converted to a regular encoding somehow before it can be saved and used? Thanks.

1 ACCEPTED SOLUTION

Accepted Solutions
SASKiwi
PROC Star

I had a similar encoding problem recently and solved it this way using the OUTENCODING option:

 

libname inlib 'MyInputLibrary'; 
libname outlib 'MyOutputLibrary' outencoding=asciiany;
proc copy noclone in=inlib out=outlib;
   select MyDataset;
run;


5 REPLIES
kiranv_
Rhodochrosite | Level 12

This is an encoding issue. The ENCODING setting in the SAS configuration file might have defaulted to latin1. We had a problem like this before and tried to fix it by changing a session option, but that did not work. At the time, we were advised to change the encoding setting to UTF-8 in the SAS config file.

Run the following to check:

proc options option=encoding;    
run; 

I think you will see latin1 as your encoding.

 

There might be another solution, but I am just sharing my experience.
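
For anyone checking this on their own system, here is a minimal sketch for comparing the session encoding with the encoding recorded on a stored dataset. The libref lib1 and dataset name data1 are taken from the question and are assumed to exist; substitute your own. The values you see will depend on your installation.

```sas
/* Session encoding. ENCODING= is set at SAS startup
   (config file or command line), so this only reports it. */
proc options option=encoding;
run;

/* The same information from the SYSENCODING automatic
   macro variable, written to the log. */
%put NOTE: Session encoding is &sysencoding;

/* Encoding recorded in the dataset header: look for the
   "Encoding" row in the attributes table.
   lib1.data1 is the name used in the question (assumed to exist). */
proc contents data=lib1.data1;
run;
```

If the two encodings differ, SAS has to transcode when it reads or writes the dataset, which is where the "lost during transcoding" error can come from.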

apolitical
Obsidian | Level 7

Thanks. Actually I see "ENCODING=UTF-8", I suppose because I'm running SAS with Unicode support. I am guessing that the incompatibility arises when I try to save a dataset that's in Unicode into a dataset in another encoding, although I don't know why SAS would automatically choose to save it in another encoding without asking; maybe that's the default.

apolitical
Obsidian | Level 7
Thanks everyone for the advice. I also found that adding an ENCODING= option in the DATA step can work:
data lib1.data1 (encoding=any);
set data1;
run;
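
A note on the options used in this thread: according to the SAS documentation, ENCODING=ANY (like OUTENCODING=ASCIIANY) tells SAS not to transcode, so the bytes are stored as they are rather than converted. If the goal is instead to keep the data explicitly tagged as UTF-8, a sketch along the following lines may work. The encoding value 'utf-8' and the assumption that the session runs with ENCODING=UTF-8 are illustrative and were not tested in this thread.

```sas
/* Assumption: the session runs with ENCODING=UTF-8 and
   work.data1 was read in correctly, as in the question. */
data lib1.data1 (encoding='utf-8');
   set work.data1;
run;

/* Confirm what was written: check the "Encoding" row. */
proc contents data=lib1.data1;
run;
```

Because the output encoding then matches the session encoding, no conversion to a narrower character set should be needed, but verify the row count against the WORK copy before relying on it.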
kiranv_
Rhodochrosite | Level 12

Please check the link below:

 

http://support.sas.com/kb/41/925.html

 

