rogersaj
Obsidian | Level 7

I know others have posted this same question, but even after reading the other posts and talking to SAS support, I can't solve the problem. 

 

Here's my code:

 

libname csl 'C:\Users\rogersaj\Documents\CSL';
proc datasets library=csl nodetails nolist;
modify csl1/ correctencoding='utf8';
quit;
data fun; 
set csl.csl1;
run;

Here's my log:

 

1    libname csl 'C:\Users\rogersaj\Documents\CSL';
NOTE: Libref CSL was successfully assigned as follows:
      Engine:        V9
      Physical Name: C:\Users\rogersaj\Documents\CSL
2    proc datasets library=csl nodetails nolist;
NOTE: Writing HTML Body file: sashtml.htm
3    modify csl1/ correctencoding='utf8';
WARNING: CORRECTENCODING was successful. MODIFY RUN group closed because the ENCODING value
         does not match the session encoding value.
4    quit;

NOTE: PROCEDURE DATASETS used (Total process time):
      real time           0.50 seconds
      cpu time            0.34 seconds


5    data fun;
6        set csl.csl1;
NOTE: Data file CSL.CSL1.DATA is in a format that is native to another host, or the file
      encoding does not match the session encoding. Cross Environment Data Access will be
      used, which might require additional CPU resources and might reduce performance.
WARNING: Some character data was lost during transcoding in the dataset CSL.CSL1. Either the
         data contains characters that are not representable in the new encoding or truncation
         occurred during transcoding.
7    run;

ERROR: Some character data was lost during transcoding in the dataset WORK.FUN. Either the
       data contains characters that are not representable in the new encoding or truncation
       occurred during transcoding.
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.FUN may be incomplete.  When this step was stopped there were 0
         observations and 805 variables.
NOTE: DATA statement used (Total process time):
      real time           0.05 seconds
      cpu time            0.04 seconds

 

Please help? Thank you!!

4 REPLIES 4
Tom
Super User

What are you trying to do?

It sort of looks like you are trying to change the encoding of the characters in a dataset?

What encoding was used to generate the original dataset?

What encoding is your current SAS session using?

What encoding do you want the new version of the dataset to use?
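
The questions above can be answered from inside SAS. What follows is a minimal sketch, assuming the `csl` libref and `csl1` dataset from the original post. `PROC OPTIONS` and the automatic macro variable `SYSENCODING` report the session encoding, and the attribute listing from `PROC CONTENTS` includes an Encoding entry for the dataset.

```sas
/* Session encoding: written to the log as ENCODING=... */
proc options option=encoding;
run;

/* Same information from the automatic macro variable SYSENCODING */
%put Session encoding: &sysencoding;

/* Dataset encoding: look for the Encoding entry in the attribute listing. */
/* The libref and dataset name are the ones used in the original post.     */
libname csl 'C:\Users\rogersaj\Documents\CSL';
proc contents data=csl.csl1;
run;
```

If the two values differ, SAS has to transcode the data on every read, which is where the CEDA note in the log comes from.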

 

It looks like you are trying to read a dataset made with UTF-8 encoding into a session that uses a single-byte encoding like WLATIN1. If so, it is very likely that the data contains characters that UTF-8 can represent but that WLATIN1 cannot represent in a single byte.  So don't do that.

 

Even if you get all of that correct, it is still possible that the actual values in the data are invalid strings.  For example, the original encoding might have been UTF-8 or something else that allows multiple bytes per character, and the data got truncated when it was saved into the SAS dataset, so that the last character is missing the final bytes that would fully define it.
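
Truncation can also happen on the way in, when a character variable is too short to hold its transcoded value. This is a separate case from bytes that were already cut off in the source file, which it does not repair. As a hedged sketch, SAS provides a read-only CVP engine that pads character variable lengths while reading (by a factor of 1.5 unless told otherwise). The folder names below are placeholders and `csl1` is the dataset from the original post.

```sas
/* Read the foreign library through the CVP engine so that character */
/* variables are given extra room during transcoding.                */
/* CVP is read-only, so the output must go to a different library.   */
libname in cvp 'folder with foreign datasets';
libname out 'new folder for local datasets';

/* NOCLONE makes PROC COPY build the output in the session encoding */
/* instead of cloning the attributes of the input dataset.          */
proc copy inlib=in outlib=out noclone;
  select csl1;
run;
```

Whether this is needed depends on the data. It matters mostly when the target encoding uses more bytes per character than the source encoding.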

rogersaj
Obsidian | Level 7

I really just want to read my dataset into my WORK library and analyze it without getting errors and warnings. My main concern is the note below, together with the fact that SAS reports an inordinately small number of observations in the dataset (when I know there are many, many observations).

Data file CSL.CSL1.DATA is in a format that is native to another host, or the file
      encoding does not match the session encoding.

 

Thanks for your comments on the UTF8 encoding. I took this part (which was a "solution" that I found on another person's post regarding the same error) out of my code.

proc datasets library=csl nodetails nolist;
modify csl1/ correctencoding='utf8';
quit;

I'm still getting the same errors.

 

libname csl 'C:\Users\Anna Joy Rogers\Documents\CSL';
proc contents data =csl.csl1; run;
proc freq data =csl.csl1;
tables ab_any;
run;

 

4 proc contents data =csl.csl1; run;
NOTE: Writing HTML Body file: sashtml.htm
NOTE: Data file CSL.CSL1.DATA is in a format that is native to another host, or the file
encoding does not match the session encoding. Cross Environment Data Access will be
used, which might require additional CPU resources and might reduce performance.

WARNING: Some character data was lost during transcoding in the dataset CSL.CSL1. Either the
data contains characters that are not representable in the new encoding or truncation
occurred during transcoding.
NOTE: PROCEDURE CONTENTS used (Total process time):
real time 1.36 seconds
cpu time 0.54 seconds


5 proc freq data =csl.csl1;
NOTE: Data file CSL.CSL1.DATA is in a format that is native to another host, or the file
encoding does not match the session encoding. Cross Environment Data Access will be
used, which might require additional CPU resources and might reduce performance.
6 tables ab_any;
WARNING: Some character data was lost during transcoding in the dataset CSL.CSL1. Either the
data contains characters that are not representable in the new encoding or truncation
occurred during transcoding.
7 run;

NOTE: There were 1 observations read from the data set CSL.CSL1.
NOTE: PROCEDURE FREQ used (Total process time):
real time 2.26 seconds
cpu time 0.90 seconds

 

 

Thanks so much for your time!

Tom
Super User

Make sure that you are running SAS with UTF-8 encoding.  How to do that depends on how you are running SAS.  If you are using interactive SAS with Display Manager on Windows, then your installation process should have created Start menu commands to launch SAS using different encodings.  On my machine they set it up with one icon labeled "SAS 9.4 (English)" that uses WLATIN1 and another labeled "SAS 9.4 (Unicode support)" that uses UTF-8.  Similarly, on Unix they set up separate commands to launch SAS with and without UTF-8 support. If you are using Enterprise Guide or SAS Studio, the choice is made by which application server you connect to.

 

Once you are running SAS with the right encoding, you should be able to read any of the other datasets.  If you are running with ENCODING='UTF-8' then you should be able to read data that uses any other encoding.  But you might not be able to go the other way.  So it might be best to just always use UTF-8.

 

Then you can just re-create the dataset.  You could just run a data step to copy it, but I like to use PROC COPY so that I can move a whole library at once.  But make sure to add the NOCLONE option so that it actually rebuilds the datasets instead of just copying them.  I also like to use the DATECOPY option so that the metadata reflects when the original file was created.

 

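/* Source library (foreign encoding) and target library (session encoding) */
libname in 'folder with foreign datasets';
libname out 'new folder for local datasets';
/* NOCLONE rebuilds each dataset; DATECOPY keeps the original creation date */
proc copy inlib=in outlib=out noclone datecopy ;
run;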
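
For a single dataset, the data step copy mentioned above could look like the sketch below. The libref and folder names are the same placeholders as in the PROC COPY step, and `csl1` is taken from the original post as an assumed dataset name. The final `PROC CONTENTS` is only a check that the new copy carries the session encoding.

```sas
libname in 'folder with foreign datasets';
libname out 'new folder for local datasets';

/* A data step writes its output in the session encoding by default, */
/* so the copy is transcoded while it is being read.                 */
data out.csl1;
  set in.csl1;
run;

/* Confirm the result: the Encoding entry should now match the session. */
proc contents data=out.csl1;
run;
```

Once the copy exists in the session encoding, later steps that read `out.csl1` should no longer need Cross Environment Data Access.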

 

 

rogersaj
Obsidian | Level 7

Thank you so very much.

 

SAS had been installed on my new machine and by default was using SAS 9.4 (English with DBCS). Now that I've changed it to SAS 9.4 (Unicode support), my issue has been solved.
