apolitical
Obsidian | Level 7

I have a CSV file with some non-English characters in it. When I read it with a DATA step and INFILE in regular SAS, the system stops reading as soon as it encounters a foreign character and ignores the remainder of the file, so a large chunk of the data is not imported. I then ran the same code under SAS 9.4 with Unicode support, which seems to read in the correct number of rows. The problem comes when I try to save it as a dataset in a pre-defined library:

 

"ERROR: Some character data was lost during transcoding in the dataset lib1.dat1. Either the data
contains characters that are not representable in the new encoding or truncation occurred during transcoding."

and 

"WARNING: The data set lib1.data1 may be incomplete. When this step was stopped there were x observations and y variables." x is way less than the number of records in the dataset in the work library.

 

Is this because the dataset is currently in Unicode and needs to be converted to a regular encoding somehow before it can be saved and used? Thanks.

Accepted Solutions
SASKiwi
PROC Star

I had a similar encoding problem recently and solved it this way using the OUTENCODING option:

 

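/* Source library, read with its existing encoding */
libname inlib 'MyInputLibrary';
/* Target library: ASCIIANY suppresses transcoding between ASCII-based encodings */
libname outlib 'MyOutputLibrary' outencoding=asciiany;
/* NOCLONE so the copy takes the output library's attributes instead of cloning the source's */
proc copy noclone in=inlib out=outlib;
   select MyDataset;
run;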


kiranv_
Rhodochrosite | Level 12

This is an encoding issue. The encoding setting in your SAS configuration may have defaulted to latin1. We had a problem like this before and tried to fix it by changing a session option, but that did not work. We were advised at the time to change the encoding setting to UTF-8 in the SAS config file.

Run this to check your current setting:

proc options option=encoding;
run;

I think you will see latin1 as your encoding.

 

There might be another solution, but I am just sharing my experience.
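
On the config-file change mentioned above: as far as I know, ENCODING is a startup-only system option, which would explain why changing it inside a running session did not work. A minimal sketch of where the setting goes and how to confirm it afterwards. The config file name and the command line shown in the comment are assumptions about a typical install; your site may instead provide a separate Unicode launcher and config.

```sas
/* Assumed location: sasv9.cfg, or the SAS command line, for example
      -ENCODING UTF-8
   The option is set when SAS starts; an OPTIONS statement
   in a running session cannot change it. */

/* Confirm what the current session is using */
proc options option=encoding;
run;

/* SYSENCODING is an automatic macro variable holding the session encoding */
%put NOTE: Session encoding is &sysencoding;
```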

apolitical
Obsidian | Level 7

Thanks. Actually I see "ENCODING=UTF-8", I suppose because I'm running SAS with Unicode support. I am guessing that the incompatibility arises when I try to save a dataset that's in Unicode into a dataset in another encoding, although I don't know why SAS would automatically choose to save it in another encoding without asking; maybe that's the default.

apolitical
Obsidian | Level 7
Thanks everyone for the advice. I also found adding an encoding option in the data step can work:
/* ENCODING=ANY: write the data set without transcoding it */
data lib1.data1 (encoding=any);
   set data1;
run;
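
To check that the save really kept everything, something like the following might help. It is only a sketch: it reuses the names from this thread (lib1.data1 and data1 in WORK) and assumes lib1 is already assigned. PROC CONTENTS reports an Encoding attribute for the stored file, and the two counts should match if no rows were dropped.

```sas
/* Save without transcoding, as in the step above */
data lib1.data1 (encoding=any);
   set work.data1;
run;

/* The "Encoding" attribute in the output shows how the file was stored */
proc contents data=lib1.data1;
run;

/* Row counts should match if nothing was lost during the save */
proc sql;
   select count(*) as work_rows from work.data1;
   select count(*) as lib_rows  from lib1.data1;
quit;
```

Keep in mind that a data set stored this way is not converted: programs reading it from a session with a different encoding may still show the non-English characters incorrectly.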
kiranv_
Rhodochrosite | Level 12

Please check the link below:

 

http://support.sas.com/kb/41/925.html

 
