Solution for ERROR: Some character data was lost during transcoding in the dataset

Contributor
Posts: 22
When I search for this error, the solution I find (linked below) is administration-oriented, and developers do not have the access needed to apply it.

 

  http://support.sas.com/kb/52/716.html

 

Please see my solution below.

 

proc options option=config; run;

proc options group=languagecontrol; run;

 

/* Show the encoding value for the problematic data set */

%let dsn=item_information_may16;

%let dsid=%sysfunc(open(&dsn,i));

%put &dsn ENCODING is: %sysfunc(attrc(&dsid,encoding));

/* Close the data set so the handle does not stay open */

%let rc=%sysfunc(close(&dsid));
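
The data set encoding is only informative when compared with the session encoding. A minimal sketch of that comparison, assuming the data set lives in the `hc` library used in the DATA step further down:

```sas
/* Session encoding, for comparison with the data set encoding printed above */
%put Session ENCODING is: %sysfunc(getoption(encoding));

/* PROC CONTENTS also reports the data set encoding in its header section */
/* Assumption: HC is the libref that holds the data set */
proc contents data=hc.item_information_may16;
run;
```

If the two values differ, SAS transcodes the character data on read, which is where the error can arise.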

 

 

 

/* Copying the item description data set with ENCODING=ANY suppresses transcoding, so the read succeeds */

 

data tmp.item_info_curr;

set hc.item_information_may16 (encoding=any);

run;

 

/* Without ENCODING=ANY, this step gave the transcoding error */

data item_info_curr;

set item_information_may16;

run;

 


Accepted Solutions
Solution
a week ago
SAS Employee
Posts: 8

Re: Solution for ERROR: Some character data was lost during transcoding in the dataset


By default, the configuration file for the configuration called "English with DBCS Support" sets the ENCODING (SAS session encoding) to an encoding that supports Japanese. The Hindi characters are not supported by the Japanese encoding.

 

If you use the configuration that is created under the "nls/u8" directory, that will set the SAS session encoding to UTF-8. The Hindi characters are supported there. 

 

If you are already using UTF-8, then try increasing the multiplier of the CVP engine. Hindi characters require 3 bytes per character. You probably need to set the CVPMULTIPLIER= option to 3.
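
As a rough sketch of the CVP suggestion, assuming a UTF-8 session and the same path and CLASS data set that appear in the replies, the multiplier goes on the LIBNAME statement:

```sas
/* CVP expands character variable lengths on read.                    */
/* A multiplier of 3 matches 3 bytes per Hindi character in UTF-8.    */
libname xx cvp '/folders/myfolders/' cvpmultiplier=3;

/* NOCLONE lets the copy take the session encoding rather than the source encoding */
proc copy in=xx out=work noclone;
  select class;
run;

/* Check the expanded character variable lengths in the copy */
proc contents data=work.class;
run;
```

The multiplier only addresses truncation. It does not help when the session encoding cannot represent the characters at all.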

 

For more information, check this comprehensive article:



All Replies
Super User
Posts: 9,681

Re: Solution for ERROR: Some character data was lost during transcoding in the dataset


You could try this code to avoid these errors.


libname xx cvp '/folders/myfolders/';
proc copy in=xx out=work noclone;
 select class;
run;

Contributor
Posts: 49

Re: Solution for ERROR: Some character data was lost during transcoding in the dataset


@Ksharp

I tried the code below but still got the same error:

 

"Some character data was lost during transcoding in the dataset XX.AA. Either the data
contains characters that are not representable in the new encoding or truncation
occurred during transcoding."

libname xx cvp '/folders/myfolders/';
proc copy in=xx out=work noclone;
 select class;
run;

What should I do to read Hindi-language words in SAS 9.4 (English with DBCS)?

Super User
Posts: 9,681

Re: Solution for ERROR: Some character data was lost during transcoding in the dataset


Yes, that is an encoding problem.
Change your SAS session encoding to match the encoding of the data set.

SAS Employee
Posts: 8

Re: Solution for ERROR: Some character data was lost during transcoding in the dataset

The cause of the error may be truncation rather than transcoding. If your SAS session encoding is UTF-8 and your data set encoding is another encoding, the error may be telling you that the length of one or more character variables is not long enough to hold the UTF-8 version of a string. Some characters require more bytes in UTF-8 than they did in the "legacy" encodings that SAS supports.
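
To illustrate the byte-versus-character difference, here is a small sketch that assumes the session encoding is UTF-8. The Hindi literal and the variable names are only illustrative:

```sas
/* Assumption: the SAS session encoding is UTF-8 */
data _null_;
  length s $60;
  s = 'नमस्ते';
  bytes = length(s);   /* length in bytes, excluding trailing blanks */
  chars = klength(s);  /* length in characters */
  put bytes= chars=;
run;
```

When `bytes` is larger than `chars`, a variable length sized for a single-byte legacy encoding can be too short for the same string in UTF-8.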

 

If that is the scenario you have, the CVP engine may be helpful to you. Several resources are available that you may find useful:

 

The National Language Reference Guide for 9.4 has some sample code showing how to use CVP. See the section "Avoiding Character Data Truncation by Using the CVP Engine" in the "Transcoding in NLS" chapter of the NLS Concepts section.

 

The white paper titled Multilingual Computing with SAS 9.4 also discusses CVP and other issues related to working with multilingual data. The paper was written for 9.4, but many of the features documented there are available with earlier releases of SAS 9.

http://support.sas.com/resources/papers/Multilingual_Computing_with_SAS_94.pdf 

Super User
Posts: 6,500

Re: Solution for ERROR: Some character data was lost during transcoding in the dataset


The link in your post doesn't work. But if I copy the text of the link and paste it into browser then it does work.

 

http://support.sas.com/resources/papers/Multilingual_Computing_with_SAS_94.pdf

Let me see if I can recreate the link to see why it didn't work in your post.

http://support.sas.com/resources/papers/Multilingual_Computing_with_SAS_94.pdf

 

So the link here works. I think in your post the URL had an extra space in it.

SAS Employee
Posts: 8

Re: Solution for ERROR: Some character data was lost during transcoding in the dataset

Thanks for pointing out the bad link! Sorry for the inconvenience.

☑ This topic is SOLVED.
