- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I am importing some xlsx sheet into SAS but for some reason the value 0.07 is being imported incorrectly. The document is mixed and all variables are being imported in as character but for some reason anytime there is the value 0.07 SAS is importing it in as the string value 7.0000000000000007E-2.
Any ideas why and how to correct this?
I know I could ad an if statement but I would rather correct it at the import step or understand why SAS is doing what it's doing.
Also touching/converting the original documents isn't an option.
Using SAS 9.4
Thanks
proc import file="&&dirfn&i" out=Xl&i
DBMS=xlsx replace;
getnames=NO;
run;
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
That number you show is not 0.07 as you previously stated, it is a large number. Any reason why there is all those extra 0's in place, I suspect at some point in the creation of the data there has been a calculation which has created a tiny fraction, you can find some examples:
Excel likely has a flag in the background, or keeps the extra bit in a special field - all reasons why Excel should not be used.
Its a question that often comes up, and for me Excel is not something which can be used in any form of data capture/storage/transfer due to these types of things. Hardcoding changes in your program isn't really GxP either. If I was forced into doing this, I would either get management written approval to apply the hardcoding, or convert to CSV, QC, and import that.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Why isn't converting the original document an option? Its most likely Excel causing the problem in the first instance, it does a good job of hiding most things. If you saveas to a new file (i.e. not modifying the original) then look at say the CSV file which is created, look at the cells with 0.07 and see if they are correct in the CSV file. If so then write a simple datastep import to import the file correctly. This would always be my go for reproducibility, control, and accuracy.
I would guess that the fields with 0.07 displaying either have a very small fraction which is not being displayed, or your proc import I guess your using is guessing the column wrong (proc import is a guessing procedure and when combined with the general mess of Excel files results in rubbish half the time).
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I added a data step with an if statement that fixes the issue I was just hoping the xlsx import could be fixed.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Open the Excel file.
Go the cell with the value that is "incorrect" in SAS.
Change the number display format to include 16 or 17 decimals.
See if the value in Excel was only displaying 0.07 actually had 0.070000000000000007
As mentioned: Excel hides things. A lot.
Also Proc Import will guess as to an appropriate format and if the value for a specific cell can't be stored as a numeric (the number of digits in value 0.07 SAS is importing it in as the string value 7.0000000000000007E-2. exceeds decimal precision in SAS) then you may get a string.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I checked out the original document and I don't see any hidden values
0.0700000000000000000000000 |
Also it isn't just a single cell that is incorrect all cells with the value 0.07 are being imported incorrectly.
I made a copy and converted the document into CSV. The values are reading 0.07 in the CSV.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
That number you show is not 0.07 as you previously stated, it is a large number. Any reason why there is all those extra 0's in place, I suspect at some point in the creation of the data there has been a calculation which has created a tiny fraction, you can find some examples:
Excel likely has a flag in the background, or keeps the extra bit in a special field - all reasons why Excel should not be used.
Its a question that often comes up, and for me Excel is not something which can be used in any form of data capture/storage/transfer due to these types of things. Hardcoding changes in your program isn't really GxP either. If I was forced into doing this, I would either get management written approval to apply the hardcoding, or convert to CSV, QC, and import that.