I have a CSV file with some non-English characters in it. When I read it with a DATA step and INFILE in regular SAS, the system stops reading as soon as it encounters a foreign character and ignores the remainder of the file, so a large chunk of the file is not imported. I then ran the same code under SAS 9.4 with Unicode support, which seems to read in the correct number of rows. The problem comes when I try to save the result as a dataset in a pre-defined library:
"ERROR: Some character data was lost during transcoding in the dataset lib1.data1. Either the data contains characters that are not representable in the new encoding or truncation occurred during transcoding."
and
"WARNING: The data set lib1.data1 may be incomplete. When this step was stopped there were x observations and y variables."
Here x is far smaller than the number of records in the dataset in the WORK library.
Is this because the dataset is currently in Unicode and needs to be converted to a regular encoding somehow before it can be saved and used? Thanks.
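For anyone hitting the same error, here is a minimal sketch for finding the rows that contain non-ASCII characters, since those are the values at risk when the data is transcoded. It assumes a UTF-8 session and the WORK.DATA1 dataset from the question; WORK.NONASCII and the loop variable _i are hypothetical names. In UTF-8, any character outside 7-bit ASCII occupies more than one byte, so a value whose byte length differs from its character length contains at least one such character:

```sas
/* Assumption: the session encoding is UTF-8 and WORK.DATA1 exists. */
/* WORK.NONASCII is a hypothetical output name.                     */
data work.nonascii;
    set work.data1;
    array cvars {*} _character_;          /* all character variables read so far */
    do _i = 1 to dim(cvars);
        /* LENGTH counts bytes, KLENGTH counts characters */
        if length(cvars{_i}) ne klength(cvars{_i}) then do;
            output;                       /* keep the row once */
            leave;
        end;
    end;
    drop _i;
run;
```

Looking at a few of these rows should show whether the characters could be represented in the target encoding at all.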
Accepted Solutions
I had a similar encoding problem recently and solved it this way using the OUTENCODING option:
libname inlib 'MyInputLibrary';
libname outlib 'MyOutputLibrary' outencoding=asciiany;
proc copy noclone in=inlib out=outlib;
select MyDataset;
run;
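A possible variant of the code above, sketched under the assumption that you would rather store the copy explicitly as UTF-8 than use ASCIIANY. The library paths and dataset name are the same placeholders as in the answer. As I understand it, NOCLONE is what lets PROC COPY apply the output library's encoding instead of carrying over the encoding of the input dataset:

```sas
/* Same placeholder library paths and dataset name as in the answer above. */
libname inlib 'MyInputLibrary';

/* Assumption: UTF-8 is the encoding you want for the stored copy. */
/* The value is quoted because it contains a hyphen.               */
libname outlib 'MyOutputLibrary' outencoding='utf-8';

proc copy noclone in=inlib out=outlib;
    select MyDataset;
run;
```

Whether this or ASCIIANY is the better choice depends on which SAS sessions need to read the dataset afterwards.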
This is an encoding issue. The encoding setting in your SAS config might have defaulted to latin1. We had a problem like this before and tried to fix it by changing a session option, but that did not work. What was suggested to us at that point was to change the encoding setting to UTF-8 in the SAS config file.
Run
proc options option=encoding; run;
I think you will see latin1 as your encoding.
There might be another solution, but I am just sharing my experience.
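To compare the session encoding with the encoding a dataset was stored in, something like the following may help. It is only a sketch and assumes the WORK.DATA1 dataset from the question:

```sas
/* Session encoding, reported two ways */
%put Session encoding (SYSENCODING): &sysencoding;
%put Session encoding (ENCODING option): %sysfunc(getoption(encoding));

/* Dataset encoding: look for the "Encoding" row in the attributes table */
proc contents data=work.data1;
run;
```

If the two differ, SAS has to transcode character data when it reads or writes that dataset, which is where the "lost during transcoding" error can come from.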
Thanks. Actually I see "ENCODING=UTF-8", I suppose because I'm running SAS with Unicode support. My guess is that the incompatibility arises when I try to save a dataset that's in Unicode into a dataset with another encoding, although I don't know why SAS would automatically choose to save it in another encoding without asking. Maybe that's the default.
data lib1.data1 (encoding=any);
set data1;
run;