siddhu1
Obsidian | Level 7

Hi,

 

I would like to describe an issue with SAS EG 7.1.

When I was trying to execute the SAS Code, I was getting the below error:

 

ERROR: Some character data was lost during transcoding in the dataset DOSDISC. Either the data contains characters that are not representable in the new encoding or truncation occurred during transcoding.

 

Can anyone help with this?

 

Thanks in Advance

Siddu1

7 REPLIES
siddhu1
Obsidian | Level 7

Hi,

 

I would like to describe an issue with SAS EG 7.1.

When I was trying to execute the SAS Code, I was getting the below error:

 

INFO: Data file RAW.DOSDISC.DATA is in a format that is native to another host, or the file encoding does not match the session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce performance.

 

ERROR: Some character data was lost during transcoding in the dataset RAW.DOSDISC. Either the data contains characters that are not representable in the new encoding or truncation occurred during transcoding.

 

Can you help on the resolution front?

 

Thanks & Regards,

Siddu1

SASKiwi
PROC Star

Please post your complete SAS log. Just supplying SAS notes doesn't show us what your code is doing.

Kurt_Bremser
Super User

Looks like you have a dataset with single-byte characters that need to be transcoded to UTF-8 when your SAS session uses that. Since the resulting multi-byte characters need more space, the transcoding leads to truncation of values when the resulting string is longer than the defined length of the variable.

Use a LENGTH statement to increase the defined size(s), and place it before any SET or MERGE that uses the dataset.

I don't know if this works; it might be that the truncation is done before the data reaches the data step.
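
A minimal sketch of the LENGTH approach, assuming a character variable named COMMENT (a hypothetical name) and an arbitrary new length of 200 bytes. The same caveat applies: it may not help if the truncation happens before the data reaches the data step.

```sas
data work.dosdisc;
  /* COMMENT and $200 are hypothetical; substitute your own variables and sizes. */
  /* LENGTH must come before SET so the longer length is the one that is used.   */
  length comment $200;
  set raw.dosdisc;
run;
```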

ehbales
SAS Employee

Hi,

As Kurt mentions, when migrating data to UTF-8, the transcoding failure message usually indicates a truncation occurred. Some characters in a single-byte encoding require multiple bytes in UTF-8. 

You may find the Character Variable Padding (CVP) libname engine useful. This is a read-only engine that expands the length of character variables in a data set. Options are available to specify how much extra space to add as well as whether to increase the length of the formats that are applied to the data set.
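
A minimal sketch of the CVP approach, assuming the RAW library points at a folder shown here as a placeholder path. The multiplier value is only illustrative; pick one that suits the characters in your data.

```sas
/* Placeholder path: replace with the folder that holds DOSDISC. */
/* CVP is read-only and pads character variable lengths on read.  */
libname raw cvp '/path/to/raw/data' cvpmultiplier=1.5;

/* Copy the data into a library that uses the session encoding,   */
/* keeping the expanded lengths.                                   */
data work.dosdisc;
  set raw.dosdisc;
run;
```

If the error persists, a larger CVPMULTIPLIER= value, or CVPBYTES= to add a fixed number of bytes, may be worth trying.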

You may find these resources helpful as you troubleshoot the issues:

 

maggiem_sas
SAS Employee

See also Examples: CEDA in the SAS documentation. The examples demonstrate the CEDA messages and provide links to other examples that avoid truncation.

Tom
Super User

That looks like an error message from your SAS code itself, not from the Windows application Enterprise Guide.

What version of SAS are you using?  Check the top of the SAS log (if EG lets you see that) or run some SAS code to ask, like PROC PRODUCT_STATUS.

 

So what encoding is your SAS session using?  You can check the ENCODING option to tell.

What encoding do you think your SAS dataset is using?  You can check using PROC CONTENTS on the dataset.
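
The checks above could be run as follows. This is only a sketch, and it assumes the RAW libref from the posted log is already assigned in the session.

```sas
/* SAS version and installed products */
proc product_status;
run;

/* Encoding of the current SAS session */
proc options option=encoding;
run;
%put NOTE: Session encoding is &sysencoding;

/* Encoding of the dataset: see the "Encoding" attribute in the output */
proc contents data=raw.dosdisc;
run;
```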

 

Most likely you just need to make more room.

https://blogs.sas.com/content/sgf/2020/08/12/expanding-lengths-of-all-character-variables-in-sas-dat...

