siddhu1
Quartz | Level 8

Hi,

 

I would like to describe the issue with SAS EG 7.1

When I was trying to execute the SAS Code, I was getting the below error:

 

ERROR: Some character data was lost during transcoding in the dataset DOSDISC. Either the data contains characters that are not representable in the new encoding or truncation occured during transcoding.

 

Can anyone help with this?

 

Thanks in Advance

Siddu1

7 REPLIES
siddhu1
Quartz | Level 8

Hi,

 

I would like to describe the issue with SAS EG 7.1

When I was trying to execute the SAS Code, I was getting the below error:

 

INFO: Data file RAW.DOSDISC.DATA is in a format that is native to another host, or the file encoding does not match the session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce performance.

 

ERROR: Some character data was lost during transcoding in the dataset RAW.DOSDISC. Either the data contains characters that are not representable in the new encoding or truncation occured during transcoding.

 

Can you help on the resolution front?

 

Thanks & Regards,

Siddu1

SASKiwi
PROC Star

Please post your complete SAS log. Just supplying SAS notes doesn't show us what your code is doing.

Kurt_Bremser
Super User

Looks like you have a dataset with single-byte characters that need to be transcoded to UTF-8 because your SAS session uses that encoding. Since the resulting multi-byte characters need more space, transcoding truncates a value whenever the resulting string is longer than the defined length of the variable.

Use a LENGTH statement to increase the defined size(s), and place it before any SET or MERGE that uses the dataset.

I don't know if this works; it might be that the truncation is done before the data reaches the data step.
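
A minimal sketch of that suggestion, assuming a character variable named COMMENT (a hypothetical name; the thread does not list the variables in DOSDISC) and an arbitrary new length of 200 bytes. The same caveat applies: if the truncation happens before the values reach the data step, this will not prevent the error.

```sas
data work.dosdisc_wide;
    /* LENGTH must appear before SET so the wider definition takes effect. */
    /* COMMENT and $200 are placeholders; use your own variables and sizes. */
    length comment $200;
    set raw.dosdisc;
run;
```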

ehbales
SAS Employee

Hi,

As Kurt mentions, when migrating data to UTF-8, the transcoding failure message usually indicates a truncation occurred. Some characters in a single-byte encoding require multiple bytes in UTF-8. 

You may find the Character Variable Padding (CVP) libname engine useful. This is a read-only engine that expands the length of character variables in a data set. Options are available to specify how much extra space to add as well as whether to increase the length of the formats that are applied to the data set.
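
As a rough sketch of how the CVP engine might be used here: the path below is a placeholder (the thread does not say where the RAW library lives), and the multiplier of 2 is an arbitrary choice for illustration. Because CVP is read-only, the sketch copies the data into WORK, where it is stored in the session encoding with the expanded lengths.

```sas
/* Hypothetical path: point this at the folder that contains DOSDISC. */
/* CVPMULTIPLIER=2 doubles the length of every character variable on read. */
libname raw cvp '/path/to/raw' cvpmultiplier=2;

/* CVP is read-only, so write the padded copy to a writable library. */
data work.dosdisc;
    set raw.dosdisc;
run;
```

If doubling is more room than needed, a smaller multiplier or the CVPBYTES= option (a fixed number of extra bytes) are alternatives; the two options cannot be combined.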

You may find these resources helpful as you troubleshoot the issues:

 

maggiem_sas
SAS Employee

See also Examples: CEDA in the SAS documentation. The examples demonstrate the CEDA messages and provide links to other examples that avoid truncation.

Tom
Super User

That looks like an error message from your SAS code itself, not from the Windows application Enterprise Guide.

What version of SAS are you using?  Check the top of the SAS log (if EG lets you see that) or run some SAS code to ask, like PROC PRODUCT_STATUS.

 

So what encoding is your SAS session using?  You can check the ENCODING option to tell.

What encoding do you think your SAS dataset is using?  You can check using PROC CONTENTS on the dataset.
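
The checks above could be run along these lines. This is only a sketch; it assumes the RAW libref from the log is already assigned in the session.

```sas
/* Report the installed SAS version and products in the log. */
proc product_status;
run;

/* Show the encoding the current SAS session is using. */
proc options option=encoding;
run;

/* The "Encoding" attribute in the output shows how the dataset is stored. */
proc contents data=raw.dosdisc;
run;
```

If the two encodings differ (for example, a wlatin1 dataset read in a UTF-8 session), SAS transcodes the values on read, which is where the truncation can occur.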

 

Most likely you just need to make more room.

https://blogs.sas.com/content/sgf/2020/08/12/expanding-lengths-of-all-character-variables-in-sas-dat...

