10-19-2016 05:16 PM
I am trying to import an xlsx document with a cell containing a string that includes "greater than" (" ≥") sign. I am getting the following note "WARNING: Some character data was lost during transcoding in column: MYVARIABLE at obs 18.". Below is the full code. How can I remove this warning? I don't have the option to change the content or type (xlsx) of the spreadsheet. I am working in SAS 9.4 in Enterprise Guide.
%let specs = /path/myspreadsheet.xlsx;
10-19-2016 06:59 PM
You may be able to use the LIBNAME Excel pointed to your source.
then instead of import you would have
where lib is the library name you assign and "sheet" is one of the data sets in the library.
But you may still have an issue if the cause is character encoding. Not changing the content or file type (I assume you mean you can't save as CSV and see if that will import) may mean that you'll never get rid of warning.
10-20-2016 01:50 AM
I don't have the option to change the content or type (xlsx) of the spreadsheet.
Of course you have the option. Even if you don't have Excel installed, LibreOffice is available on all common platforms and can be used to convert the excel dungheap into a nice .csv.
Excel files are NOT, I repeat, NOT suited for proper transfer of tabular data.
10-20-2016 11:01 AM
Thanks for your replies.Yes, the warning is caused by a character encoding issue. The original text in the xlsx file is "score ≥0.5" and the imported text in my work sas file is "score 0.5". P
And I want to clarify I do have Microsoft Excel in my computer. (sorry, did not explain myself well). What I cannot do is modify the xlsx file. This is because the instruction I was given is to import that particular file.
10-20-2016 11:23 AM
I just had to go through this wonderful process...what a pain!
No guarantees, but hopefully this will get you going.
First, you need to figure out how your spreadsheet is encoded. My source data wasn't a spreadsheet, but I think this will work. In Excel, do File | Save As | click More options | click Tools | click Web options | click Encoding. Hopefully, this will show you what your spreadsheed is stored with.
Once I knew this, and had my data imported, running this fixed my up (in my case the data was UTF-8).
/* Set the encoding of the dataset to utf-8 */
proc datasets lib=work nolist;
modify indata / CORRECTENCODING='utf-8';