08-30-2016 08:11 AM
I am facing problem while importing excel file data with special characters. I am working with SAS 9.1.3 Unix platform and using proc import with dbms = xls statement.
proc import datafile = "patient_data.xls" out = lot_file dbms = xls replace;
getnames = no;
sheet = "lot";;
textsize = 32767;
08-30-2016 08:49 AM
And what IS your problem?
08-30-2016 09:08 AM
Thank you for your response.
The issue is special characters in the data are not imported correctly.
For example: The excel file has data:
"Analysis of Number (%) of Subjects Reporting Ctrough >20 μg/mL at Cycle 5 (Pre dose Cycle 6 ) - Per Protocol Population".
But after importing the μ is replaced with some ¼ .
Please see attached screen shot of the same.
08-30-2016 09:13 AM
Looks like you have a discrepancy between the session encoding of SAS and the encoding of the Excel file (most probably UTF-8).
Either you need to tinker around with the SAS session encoding, or you could correct those characters by using the translate() function.
08-31-2016 08:07 AM
Thank you for your quick response.
My SAS sesson encoding is WLATIN1 and i cant use translate as this code should be working dynamically.
Also i see one article
"The Impact of Change from wlatin1 to UTF-8 Encoding in SAS Environment" suggesting that
"DBMS=XLS does not support the multiple-byte characters! Thus, you cannot create
(or import) via PROC EXPORT/PROC IMPORT a .XLS file with characters like μ in UTF-8 encoding SAS
So is their a way to import XLS files with special characters with SAS 9.1.3 on unix environment ?
08-31-2016 08:48 AM
I handled a similar problem (mainframe ebcdic data with german codepage to UNIX(AIX) with en-US) by creating a macro that contains the translate function for all the German "Umlauts" and applying that to all strings read from the mainframe. After that, a single change to the macro will have an impact in all programs using it, making maintenance quite easy.
Since SAS obviously does not support UTF in the dbms=xls in proc import 8as per your quoted article), I suggest to switch from xls to a more sensible file format. Use CSV or another text-based file format for transfer, and it is that much easier to deal with codepage and multibyte character problems.