09-07-2013 02:49 AM
Hi, Ladies and Gentlemen:
After I export a SAS data with a lot of text strings to Excel 2010. The Excel reports:
Excel found unreadable content in ... do you want to recover the contents of this workbook?
After I click okay, the Excel fail to repair the contents, and everything is blank (see below)
My SAS code is :
proc export data=bx.xxx outfile='e:\temp\xxx .xlsx' dbms = xlsx replace;
The SAS I'm running is PC SAS for Windows 7 64 bit.
What did I do wrong? Thanks !
09-07-2013 05:26 AM
Is your MS Office install also 64 bit?
Are you able to identify "unreadable" content in the xml file by dumping it into Notepad+ before allowing Excel to attempt recovery?
Does your SAS data contain non standard characters eg as part of company names?
09-07-2013 02:51 PM
Thank you so much for helping out -
Mine is 64 bit win and office. It is likely that the company names contains some non standard characters. but I tried the unicode mode, and it is still the same. When I open the repair log, it says:
<?xml version="1.0" encoding="UTF-8" standalone="true"?>
-<recoveryLog xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"><logFileName>error034800_01.xml</logFileName><summary>Errors were detected in file 'E:\temp\connected_company.xlsx'</summary>-<removedParts summary="Following is a list of removed parts:"><removedPart>Removed Part: /xl/sharedStrings.xml part with XML error. (Strings) Illegal xml character. Line 1, column 203.</removedPart></removedParts>-<removedRecords summary="Following is a list of removed records:"><removedRecord>Removed Records: Cell information from /xl/worksheets/sheet1.xml part</removedRecord></removedRecords></recoveryLog>
I don't know how to move forward....
09-07-2013 07:25 PM
You might be able to narrow it down by importing the "recovered" table back into SAS (under a different name) and using Proc Compare to identify which records/columns are giving you a problem. Proc Compare has a number of options; you might need to play with it a bit to track down where the error is being generated (assuming it is a data driven error rather than a failure to write a readable xml file).
Once you have identified the illegal character(s) a data step using the Translate() function (read the doco - the parameter order is unusual).
09-08-2013 07:02 PM
Is there a way to just get rid of those company names with unconventional characteristics? The thing is that I'm mainly concerned about the companies in the U.S., so those names are not really essential for me; I'm okay with throwing them away. Thank you -
09-09-2013 12:56 AM
If you are sure the problem is in the company names you could use the following where clause to screen out all names which include anything besides standard English letter, numerals and spaces:
where company_name = compress (company_name, ' AD', 'K') ;
(this instructs the compress function to discard every character unless it is a space, English letter, or digit, in this instance the 'K' modifier says 'keep those')
You may have to add in the registered trade mark symbol (small R in a circle) if your companies use it.