- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I'm exporting using PROC EXPORT on SAS 9.4 UNIX. The SAS logs says the file was created successfully. However when trying to open it in Excel 2010 SP2 it says it has found unreadable content. Is there any workaround for this? I need to export to Excel and not a text file because I need to keep data types consistent.
edit:
Downloaded the SAS dataset and tried repeating the process on PC SAS 9.4 with the same results.
edit2:
Wound up just using pandas. I'm thinking it might have been an encoding issue.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Using Excel to keep data types consistent?! This sounds like a good oxymoron to me.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
What I mean is I don't want to export it to a CSV then have to go into each column in Excel and change the data type to the appropriate one.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
How are you moving the files?
Did you set the transfer type to binary?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
scp through a SSH connection. I'm transferring it as a binary file.
edit: Let me try downloading it with a different program and see if that's it.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I'm assuming this happens regardless if you create the file with PROC EXPORT or ODS EXCEL?
You should also show your code...
To narrow down the problem, does a basic export work, ie If you run the code below and move the file are there any issues?
If not, it may have something to do with your data.
proc export data=sashelp.class outfile='/folders/myfolders/myfile.xlsx' dbms=xlsx replace;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
The export code looks like this:
PROC EXPORT DATA=MYSASDATASET OUTFILE='/some/path/file.xlsx' DBMS=XLSX REPLACE; RUN;
The code you pasted works with the original process I was using. I'm getting a non-corrupted Excel file.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
So what's in your data that's problematic?
Any invisible characters, special characters or otherwise formats that may cause issues?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I thought it might be unprintable characters because I'm working with some sloppy data. However after going through each character column and using this:
BAD_COLUMN = PRXCHANGE('s/[^\x20-\x7F]//', -1, BAD_COLUMN);
i.e. only allowing ASCII characters, I still get a corrupted export.
No special formats.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
How many records and variables are we talking about?
If there are not a lot of records I might be tempted to print the character variables, likely to an RTF document, and see if anything looks funny in the output.
Or if you have a few longish text fields just print those as longer text seems to have more room for problematic data entry/capture issues.
If the set has a lot of records then perhaps you could get lucky looking at a couple of hundred lines.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Binary search?
TBH I don't know what the issue would be so guess and test is likely the quickest approach here.
Take approximately half your variables and see if the export works. If it does, check the next half, if it doesn't work, then take another half and repeat until you find the issue.
You can select a range of variables in a specific order using a -- b where a is the first variable and b is the last variable in the list.
You can also select only character - since this is likely to be a character issue with:
a-character-b
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
OK, that sounds like a good strategy. Thanks for all the help everyone.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I was able to use this function on all character variables and no longer got the corrupt xlsx file. I read in an Excel sheet (my source which had query response text from sites all over the world), manipulated & summarized the data, then needed to output it back to a workbook with other sheets.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Just as a side note: The PRX...() functions don't work for multi-byte characters and though your test might be invalid.