Solved: Proc Import reading 0.07 string as sceintific

michelconn · Posted 10-17-2017 10:23 AM

I am importing some xlsx sheet into SAS but for some reason the value 0.07 is being imported incorrectly. The document is mixed and all variables are being imported in as character but for some reason anytime there is the value 0.07 SAS is importing it in as the string value 7.0000000000000007E-2.

Any ideas why and how to correct this?

I know I could ad an if statement but I would rather correct it at the import step or understand why SAS is doing what it's doing.

Also touching/converting the original documents isn't an option.

Using SAS 9.4

Thanks

proc import file="&&dirfn&i" out=Xl&i
	DBMS=xlsx replace;
	getnames=NO;
run;

RW9 · Posted 10-17-2017 11:47 AM

That number you show is not 0.07 as you previously stated, it is a large number. Any reason why there is all those extra 0's in place, I suspect at some point in the creation of the data there has been a calculation which has created a tiny fraction, you can find some examples:

https://support.microsoft.com/en-us/help/214118/how-to-correct-rounding-errors-in-floating-point-ari...

https://support.microsoft.com/en-us/help/78113/floating-point-arithmetic-may-give-inaccurate-results...

Excel likely has a flag in the background, or keeps the extra bit in a special field - all reasons why Excel should not be used.

Its a question that often comes up, and for me Excel is not something which can be used in any form of data capture/storage/transfer due to these types of things. Hardcoding changes in your program isn't really GxP either. If I was forced into doing this, I would either get management written approval to apply the hardcoding, or convert to CSV, QC, and import that.

View solution in original post

RW9 · Posted 10-17-2017 10:30 AM

Why isn't converting the original document an option? Its most likely Excel causing the problem in the first instance, it does a good job of hiding most things. If you saveas to a new file (i.e. not modifying the original) then look at say the CSV file which is created, look at the cells with 0.07 and see if they are correct in the CSV file. If so then write a simple datastep import to import the file correctly. This would always be my go for reproducibility, control, and accuracy.

I would guess that the fields with 0.07 displaying either have a very small fraction which is not being displayed, or your proc import I guess your using is guessing the column wrong (proc import is a guessing procedure and when combined with the general mess of Excel files results in rubbish half the time).

michelconn · Posted 10-17-2017 10:37 AM

I don't want to touch the original data since it's government work under Good Laboratory Practices and the the data is from an outside consultant. Messing with the original data would involve getting the PI involved and providing justification. It is possible but I would rather not.

I added a data step with an if statement that fixes the issue I was just hoping the xlsx import could be fixed.

data_null__ · Posted 10-17-2017 10:41 AM

Wouldn't GLP say EXCEL is not a data base?

michelconn · Posted 10-17-2017 10:46 AM

It not really a database it's more various instrument output tables in an excel document. I am converting the data into a SAS database for our statistician to use.

ballardw · Posted 10-17-2017 10:50 AM

Open the Excel file.

Go the cell with the value that is "incorrect" in SAS.

Change the number display format to include 16 or 17 decimals.

See if the value in Excel was only displaying 0.07 actually had 0.070000000000000007

As mentioned: Excel hides things. A lot.

Also Proc Import will guess as to an appropriate format and if the value for a specific cell can't be stored as a numeric (the number of digits in value 0.07 SAS is importing it in as the string value 7.0000000000000007E-2. exceeds decimal precision in SAS) then you may get a string.

michelconn · Posted 10-17-2017 11:08 AM

I checked out the original document and I don't see any hidden values

0.0700000000000000000000000

Also it isn't just a single cell that is incorrect all cells with the value 0.07 are being imported incorrectly.

I made a copy and converted the document into CSV. The values are reading 0.07 in the CSV.

RW9 · Posted 10-17-2017 11:47 AM

That number you show is not 0.07 as you previously stated, it is a large number. Any reason why there is all those extra 0's in place, I suspect at some point in the creation of the data there has been a calculation which has created a tiny fraction, you can find some examples:

https://support.microsoft.com/en-us/help/214118/how-to-correct-rounding-errors-in-floating-point-ari...

https://support.microsoft.com/en-us/help/78113/floating-point-arithmetic-may-give-inaccurate-results...

Excel likely has a flag in the background, or keeps the extra bit in a special field - all reasons why Excel should not be used.

Its a question that often comes up, and for me Excel is not something which can be used in any form of data capture/storage/transfer due to these types of things. Hardcoding changes in your program isn't really GxP either. If I was forced into doing this, I would either get management written approval to apply the hardcoding, or convert to CSV, QC, and import that.

Proc Import reading 0.07 string as sceintific

Re: Proc Import reading 0.07 string as sceintific

Re: Proc Import reading 0.07 string as sceintific

Re: Proc Import reading 0.07 string as sceintific

Re: Proc Import reading 0.07 string as sceintific

Re: Proc Import reading 0.07 string as sceintific

Re: Proc Import reading 0.07 string as sceintific

Re: Proc Import reading 0.07 string as sceintific

Re: Proc Import reading 0.07 string as sceintific

SAS Innovate 2026 Registration is Open

SAS Training: Just a Click Away