thanikondharish
Fluorite | Level 6

When I run some programs, I get the error shown below:

ERROR: some character data was lost during transcoding in the dataset main.ss. Either the data contains characters that are not representable in the new encoding or truncation occurred during transcoding.

How can I ignore this error and still get the data set?

Reeza
Super User

Check the encoding of the file and make sure it matches what you've set to import. Or clean up the import file.

 

thanikondharish
Fluorite | Level 6
I can't understand this answer.
Can you give an example?
ballardw
Super User

@thanikondharish wrote:
I can't understand this answer.
Can you give an example?

Encoding is the character set used to store the data. Since computers really only speak binary, patterns of 1s and 0s are used to represent the letters, digits and other characters that humans use. Depending on the language, some characters require more 1s and 0s to store. ASCII and EBCDIC are two very early coding schemes designed by English speakers. English uses only 52 "letters" of Roman alphabet descent, 26 each of upper and lower case. Other languages add accent marks, umlauts, and a collection of other diacritic marks to the Roman letters to indicate other sounds, so they need more codes. Still other languages, Chinese, Japanese and Hindi for example, have many more characters. Encoding is the set of rules for turning the proper number of 1s and 0s into the correct characters.
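As a small illustration of the same bits meaning different things, here is a sketch that stores the two bytes C3 A9 (the UTF-8 form of an e with an acute accent) and asks SAS how many bytes and how many characters it sees. The variable names are just for illustration. In a UTF-8 session I would expect chars=1, while in a single-byte session such as WLATIN1 I would expect chars=2, because each byte is then treated as its own character.

```sas
data _null_;
  /* 'C3A9'x is the two-byte UTF-8 sequence for e with an acute accent */
  x = 'C3A9'x;
  bytes = length(x);   /* LENGTH counts bytes */
  chars = klength(x);  /* KLENGTH counts characters in the session encoding */
  put bytes= chars=;
run;
```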

 

Files that contain text may use any of a number of encodings, though UTF-8, UTF-16 and UTF-32 are common ones for internationally sourced files. The part you are concerned with is that your SAS system needs to know how to interpret the information. If your data set was built with UTF-16 but your default is UTF-8 or ASCII or EBCDIC, then the 1s and 0s are "wrong" from your SAS session's point of view, and it doesn't understand how to actually read or use the data. You can specify the encoding on a LIBNAME statement or as a data set option.
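For the LIBNAME route, a minimal sketch might look like the following. The folder path is made up, and UTF-8 is only a guess at what the data really contains. INENCODING= overrides the encoding recorded in the files when data sets in that library are read, so it only helps if you actually know the recorded encoding is wrong or missing.

```sas
/* Hypothetical path: replace with the folder that holds the SS data set */
libname main 'C:\mydata' inencoding='utf-8';

/* Read a few rows through the library to see whether the text looks right */
proc print data=main.ss (obs=5);
run;
```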

 

But you need to have some idea what the file contains.

 

You might be able to use your data by adding an ENCODING= option. I would try UTF-8, UTF-16 and UTF-32, in that order, to see if one of them works.

 

Something like this will attempt to print the first 5 records using UTF-8 encoding.

proc print data=main.ss (encoding='utf-8' obs=5);
run;

 

If one of those works, then you need to add the ENCODING= data set option whenever you use that data set (and possibly others made from it).

If those don't work, then check the documentation for the list of other encodings that may be valid.
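If the goal is only to get past the error and look at the data, one more value that may be worth trying is ENCODING=ANY, which tells SAS not to transcode at all. This is a sketch, not a fix: the output data set name WORK.SS_COPY is made up, and any characters that your session encoding cannot represent will probably still show up as garbage in the copy.

```sas
/* Copy the data set without transcoding; WORK.SS_COPY is a hypothetical name */
data work.ss_copy;
  set main.ss (encoding=any);
run;

/* Inspect a few rows to see which values look wrong */
proc print data=work.ss_copy (obs=5);
run;
```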

LinusH
Tourmaline | Level 20

1. Check what encoding your SAS session is using.
2. Check what encoding your data has. If it is a SAS data set, use PROC CONTENTS.
What to do next depends on the outcome of the above. Read through the NLS documentation; there is some guidance there, but even I can find these kinds of issues hard to understand and solve.
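The two checks above could be run roughly like this, assuming the MAIN library from the original post is already assigned:

```sas
/* 1. Session encoding: written to the log */
proc options option=encoding;
run;
%put Session encoding: &sysencoding;

/* 2. Data set encoding: look for the "Encoding" row in the attributes table */
proc contents data=main.ss;
run;
```

If the two values differ, SAS has to transcode the data every time it reads the data set, which is where the error in the question comes from.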

Data never sleeps
