SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Encoding error during xlsx import: Explain to me like I'm a fifth grader

Reply
New Contributor
Posts: 4

Encoding error during xlsx import: Explain to me like I'm a fifth grader

Hey guys,

 

I'm brand new to SAS and have limited experience, and the problem that I'm having seems way over my head.  I'm trying to import data from an excel file into SAS, but during a my data step I get an error about encoding.  Here's the script that I'm running:

 

libname surveyx pcfiles path='C:\Users\au551418\Desktop\Learning SAS\AUWork\Survey\Survey.xlsx';                                      

data work.survset;                                                                                                                      

        set surveyx.’Data$’n;                                                                                                                    

run;

 

The surveyx library successfully is assigned each time, but when it comes to the set function, I always get this error:

 

ERROR: Unable to transcode data to/from UCS-2 encoding

 

Now, I don't know anything about encoding, and have tried to look up on previous forum posts what in the world to do.  I have tried a lot of stuff, but none of it has worked, and many of the posts are not xlsx files (which I have no idea if that matters or not).  Can anyone help me in a simplistic way?  Thanks!

Super User
Super User
Posts: 8,181

Re: Encoding error during xlsx import: Explain to me like I'm a fifth grader

To display characters to the screen a computer uses an encoding table, so for instance most tables contain the western alphabet, lower and upper, and they have their codes associated.  If you replace the encoding table, the same codes refer to different symbols.  This is how a computer a) stores information - a combination of what encoding table, and the code number, and b) represents it on screen, by drawing the associated character from the table.

What the error is telling you is that there is an encoding table found that it cannot switch to (this is part of National Language Support in SAS, plenty of docs on it).  Does the Excel file open fine in Excel?  If so then I would suggest saving it to CSV and reading that in, however do be aware that you may be missing some information - you will see odd characters on screen in Excel.

I personally have never heard of a UCS-2 encoding format, nor can I find much information about it from a google search, so I would really go back to the source of that file and find out what is going on there, get them to supply it in a standard encoding like UTF.

New Contributor
Posts: 4

Re: Encoding error during xlsx import: Explain to me like I'm a fifth grader

Yea, got encoding, just unfamiliar with SAS more Smiley Happy   As far as Excel, it opens easily and without issue.  I don't see any unusual characters, but there could be something hidden in there.

 

I tried saving a copy of the excel file as a csv and then importing with this code:

 

libname surveyx pcfiles path='C:\Users\au551418\Desktop\Learning SAS\AUWork\Survey\Copy of Survey.csv';
data work.survset;
set surveyx.'Data$'n;
run;

 

But now I get this Error code:

 

ERROR: Unable to determine file type from PATH= option.
ERROR: Error in the LIBNAME statement.

 

Do I need to change some format to import csv's? 

 

Super User
Super User
Posts: 8,181

Re: Encoding error during xlsx import: Explain to me like I'm a fifth grader

If you save to CSV - which is a text file with data delimited by commas - you do not read it in by using Excel specific libname statements.  You read it in as a text file.  To get you started you can get most of the code by doing:

proc import datafile="yourcsv.csv" out=want dbms=csv;
run;

Then in the log it will show you a large datastep import.  This will consist of a file statement, an input statement, plus others like format.  This is how you read in plain text files, and you can take this code and alter it to read in the data anyway you like, say changing lengths, or reading in dates in a specific format.  Its the most powerful and controlable way to read data.

This talks about tab delimited, but same thing for CSV:

https://www.sas.com/content/dam/SAS/support/en/technical-papers/ts673.pdf

New Contributor
Posts: 4

Re: Encoding error during xlsx import: Explain to me like I'm a fifth grader

Perfect, thank you!  Got this to run and data successfully imported.  Don't understand the whole UCS-2 formatting thing, but guess we bypassed that Smiley Happy  Thanks for your help!

Community Manager
Posts: 3,024

Re: Encoding error during xlsx import: Explain to me like I'm a fifth grader

Regarding the UCS-2 encoding, this SAS Note explains that you can get that error when running a DBCS version of SAS.  If you need support for national characters including multibyte characters as you might find in Japan or Chinese languages, then SAS advises using the UTF-8 mode instead.  The SAS Note describes this approach.

 

Just a bit of history: SAS has supported a DBCS version for many years -- with the "DB" meaning SAS requires 2 bytes per char.  However, in the past several years Unicode -- and it's lighter cousin UTF-8 -- has emerged as the preferred standard for encoding such data -- characters occupy 1-4 bytes depending on the char in question.  Personally, I use SAS in UTF-8 configuration for almost all of my jobs now.

New Contributor
Posts: 4

Re: Encoding error during xlsx import: Explain to me like I'm a fifth grader

Posted in reply to ChrisHemedinger

Yea, I saw those links before, but they really didn't make any sense to me.  If I wanted to convert to UTF-8, what code do I write to do?  Or do I have to change the excel file? 

 

I didn't find anything similar to what was written there:

 

"C:\Program Files\SAS\SASFoundation\9.2\sas.exe" –CONFIG "C:\Program Files\SAS\SASFoundation\9.2\nls\u8\SASV9.CFG"

 

Don't know what I'm supposed to do about that??

 

Community Manager
Posts: 3,024

Re: Encoding error during xlsx import: Explain to me like I'm a fifth grader

If you are running with SAS on your own PC, the easiest thing to do is to change your SASHOME/SASFoundation/9.4/sasv9.cfg file to reference the /u8/sasv9.cfg file.

 

If using a SAS session that is maintained by an admin, you'll have to ask your admin for access to a session that's running in UTF-8.

Ask a Question
Discussion stats
  • 7 replies
  • 182 views
  • 1 like
  • 3 in conversation