How to check the encoding of an input file

Hello SAS Experts!!!

Merry Christmas to all!

I am using a DATA step to read data from an input file. The file might contain Unicode-encoded data, which I handle with the ENCODING= option of the FILENAME statement.

FILENAME nls ":\path\.txt" ENCODING="unicode";

This way, the program handles files saved as Unicode well.

But I ran into a tricky situation when I learned that the input file could be saved either as Unicode or as plain ANSI. So I first have to check the encoding of the input file and then apply the matching ENCODING= option.

Is there any file-handling function that can determine the encoding a file was saved with, so that we can apply the corresponding ENCODING= option afterwards?

Please suggest.

Regards
Kapil Agrawal

Re: How to check the encoding of an input file

A text file is basically a sequence of bytes; there is no built-in information that tells the reader which encoding was used. If you need that information carried in the file itself, you would have to use XML files instead, which can declare their encoding. So the encoding of a text file is decided by the producer, and the consumer (the SAS program) cannot reliably work it out automatically, at least not within SAS to my knowledge.

But I can't see why this should be a problem. If you are building a SAS program to handle files, surely these files are not created with a random encoding? You should be able to agree on an encoding in a file specification.

/Linus
Data never sleeps

Re: How to check the encoding of an input file

I can only agree with Linus that there should be a file specification.

Browsing the Internet a bit, it seems that there might be ways to determine (sometimes!) what encoding is used: http://codesnipers.com/?q=node/68
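One such "sometimes" check is to look for a byte order mark (BOM) in the first bytes of the file. The following is only a minimal sketch, not a tested solution: the path and the macro variable names `inpath` and `file_enc` are made up for illustration, and it assumes that a file without a BOM is ANSI (Windows-1252, `wlatin1` in SAS). That assumption fails for UTF-8 files saved without a BOM, so this cannot replace a file specification.

```sas
/* Hypothetical path; point this at the real input file. */
%let inpath = C:\data\input.txt;

/* Fallback if the file is too short to contain a BOM (assumption: ANSI). */
%let file_enc = wlatin1;

/* Step 1: read the first three bytes as a raw byte stream and inspect them. */
data _null_;
    length enc $12 hex $6;
    infile "&inpath" recfm=n;            /* RECFM=N: no record boundaries */
    input bom $char3.;                   /* first three bytes, unmodified */
    hex = put(bom, $hex6.);              /* hex string, e.g. EFBBBF */
    if hex = 'EFBBBF' then enc = 'utf-8';
    else if substr(hex, 1, 4) = 'FFFE' then enc = 'utf-16le';
    else if substr(hex, 1, 4) = 'FEFF' then enc = 'utf-16be';
    else enc = 'wlatin1';                /* no BOM found: assume ANSI */
    call symputx('file_enc', enc);
    stop;                                /* only the first bytes are needed */
run;

/* Step 2: assign the fileref with the encoding chosen above. */
filename nls "&inpath" encoding="&file_enc";
```

After this runs, `%put &=file_enc;` shows which branch was taken, so a wrong guess is at least visible in the log before the file is read.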

But although determining the encoding programmatically is an interesting task, I think it would be the wrong approach. What you need is a defined interface (a file specification).

Cheers, Patrick

Re: How to check the encoding of an input file

Thanks a lot, Patrick and Linus!

I finally managed to arrange that the files arrive in a predetermined encoding only, rather than a random one.

Thanks a ton!

regards
Kapil Agrawal