kmw
SAS Employee

I support numeric precision in the SAS Technical Support division. Instead of creating a SAS Note, I worked with our Publications department to add a section to the documentation, titled "Accuracy on x64 Windows Processors".

ChrisNZ
Tourmaline | Level 20

Thanks @kmw

 

The updated document states: "The following section shows the conversion process for a decimal number that cannot be represented precisely in floating-point representation."

 

I don't see how 0.5 cannot be represented precisely in floating-point representation.

 

I reworked the process described just above the new paragraph, using the value 0.5 instead of 255.75.

3FE0000000000000 is the exact representation.

 

See below:

 

This example shows the conversion process for the decimal value 0.5 to floating-point representation.

  1. Use the base 2 number system to write out the value 0.5 in binary.

Note: Each bit in the mantissa represents a fraction whose numerator is 1 and whose denominator is a power of 2; that is, the mantissa is the sum of a series of fractions such as 1/2, 1/4, 1/8, and so on. Therefore, for any floating-point number to be represented exactly, you must be able to express it as such a sum.

Base 2

  Power:   2^7   2^6   2^5   2^4   2^3   2^2   2^1   2^0  .  2^-1   2^-2
  Value:   128    64    32    16     8     4     2     1  .   1/2    1/4

0.5 = 0 x 2^7 + 0 x 2^6 + 0 x 2^5 + 0 x 2^4 + 0 x 2^3 + 0 x 2^2 + 0 x 2^1 + 0 x 2^0 + 1 x 2^-1 + 0 x 2^-2

So, the value 0.5 is represented in binary format as 0000 0000.10

  2. Move the decimal point over until there is one digit to the left of it. This process is called normalizing the value. Normalizing a value in scientific notation is the process by which the exponent is chosen so that the absolute value of the mantissa is at least one but less than ten. For this number, you move the decimal point -1 places (that is, one place to the right):

1.000 000 00

Because the decimal point was moved -1 places, the exponent is now -1.

  3. The bias is 1023, so add -1 to 1023 to get

1022

  4. Convert the decimal value, 1022, to hexadecimal using the base 16 number system:

Base 16

  Power:   16^7          ...   16^4     16^3   16^2   16^1   16^0
  Value:   268,435,456   ...   65,536   4096   256    16     1

1022 = 3 x 16^2 + 15 x 16^1 + 14 x 16^0, which is 3FE in hexadecimal.

  5. The converted hexadecimal value for 1022 will be placed in the exponent portion of the final result.

  6. Convert 3FE to binary format:

0011 1111 1110
   3    F    E

  7. In Step 2 above, delete the first digit and decimal (the implied one-bit):

0000 0000

  8. Break these up into nibbles (half bytes) so that you have

0000 0000

  9. To have a complete nibble at the end, add enough zeros to complete 4 bits:

0000 0000

  10. Convert

0000 0000

to its hexadecimal equivalent to get the mantissa portion:

0000 0000

0     0


The final floating-point representation for 0.5 is

3FE0 0000 0000 0000


mkeintz
PROC Star

I thank @kmw as well.  The issue needs documentation.

 

I agree entirely with @ChrisNZ's comments.  I mentally tracked through the conversion process described in the new documentation, using .500000000000000000000000000 just as Chris describes, and I still do not see how it generates anything but 3FE0 0000 0000 0000.

 

Something seems to be happening that is not captured in the description as I read it.  If you put a 1 in front of the decimal, then the inequality disappears. Is there something different about the conversion when the absolute value is less than 1?

 

And really, it is hard to accept that 2**-1 cannot be stored accurately just because the decimal-format ASCII token being converted to floating point has too many trailing insignificant zeroes.

 

Question: is this also the behavior of other languages that generate 8-byte floating-point values on Windows from ASCII numeric literals?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
ChrisNZ
Tourmaline | Level 20

@mkeintz No issue for 8-byte numbers in Visual Studio's C++ or C.

 

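#include <iostream>

int main() {
	// Two literals for the same value; they differ only in the number of trailing zeros.
	double a = 0.50000000000000000000;
	double b = 0.5000000000000000000000000;

	// Exact comparison is intended here: both literals should convert to exactly 0.5.
	if (a == b) {
		std::cout << "Same\n";
	}
	else {
		std::cout << "Different\n";
	}

	return 0;
}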

 Same

 

 

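#include <stdio.h>

int main(void) {
	/* Two literals for the same value; they differ only in the number of trailing zeros. */
	double a = 0.50000000000000000000;
	double b = 0.5000000000000000000000000;

	/* Exact comparison is intended here: both literals should convert to exactly 0.5. */
	if (a == b) {
		printf("Same\n");
	}
	else {
		printf("Different\n");
	}

	return 0;
}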

 

 Same

 

 

 


data _null_;
  /* Same two literals as in the C++ and C tests above. */
  A=0.50000000000000000000;
  B=0.5000000000000000000000000;
  if A=B then putlog 'Same'; else putlog 'Different';
run;

 

Different

 

 

ChrisNZ
Tourmaline | Level 20

Also, regarding the explanation:

"The routine used to compute the result is slightly different on Windows than on any other host (Linux, UNIX, AIX, and so on)."

 

It'd be interesting to know if this difference is due to a quirk in Windows or in SAS.

kmw
SAS Employee

You are correct that .5 can be represented precisely as shown in the first example of the new section of the documentation.  If I'm reading your responses correctly, it seems the line of text just before the new section title is causing the confusion.

 

It states: "The following section shows the conversion process for a decimal number that cannot be represented precisely in floating-point representation."

It should instead state: "The following section shows the conversion process for a decimal number that cannot be represented precisely in some scenarios."

 

Also, @ChrisNZ, you asked whether the difference is due to a quirk in Windows or in SAS.  It is SAS's proprietary way of processing floating-point numbers that go beyond 15 significant digits.

mkeintz
PROC Star

@kmw

 

Is it safe to say that the SAS proprietary floating point treatment can produce this anomalous impact of trailing non-significant zeroes only for values between -1 and 1?

 

In other words, can adding excessive trailing non-significant zeroes change the floating-point representation for values greater than 1 (or less than -1)?

 

thanks,

Mark

 

ChrisNZ
Tourmaline | Level 20
@kmw The page also states:

"Although these values appear to be alike, the internal representations differ slightly, because the IEEE floating-point representation can only represent 15 digits."

"This issue pertains to floating-point representation on the x64 processors."

 

I find these 2 statements misleading:

 

1- The 15-digit representation limit does not in any way explain (even though the word "because" is used) why SAS reads these numbers like this:

3FE0000000000000 L=21 NB=0.50000000000000000000000

3FE0000000000000 L=22 NB=0.500000000000000000000000

3FDFFFFFFFFFFFFE L=23 NB=0.5000000000000000000000000

3FDFFFFFFFFFFFFE L=24 NB=0.50000000000000000000000000

3FDFFFFFFFFFFFFF L=25 NB=0.500000000000000000000000000

3FE0000000000000 L=26 NB=0.5000000000000000000000000000

3FE0000000000000 L=27 NB=0.50000000000000000000000000000

3FDFFFFFFFFFFFFF L=28 NB=0.500000000000000000000000000000

3FDFFFFFFFFFFFFF L=29 NB=0.5000000000000000000000000000000

3FE0000000000000 L=30 NB=0.50000000000000000000000000000000

3FE0000000000000 L=31 NB=0.500000000000000000000000000000000

3FE0000000000000 L=32 NB=0.5000000000000000000000000000000000

 

 

2- Likewise, floating-point representation on x64 processors has no issue that would explain this behaviour, which occurs only in SAS and only under Windows.

 

3- Unrelated to this discussion but related to numerical precision, and much wider and very puzzling:

The whole numerical precision issue exists because rational numbers were introduced to represent floating-point numbers. This seems like a ludicrous idea.

Rational numbers were introduced to represent the mantissa value when negative exponents were used as part of the algorithm detailed above.
Had the mantissa portion of the number kept positive-exponent representations only, with the position of the decimal point shifted using the exponent portion of the number, this numerical precision issue would not even exist.

There must have been a reason for deciding to use negative exponents and to represent the decimal portion of numbers as a sum of rational numbers.

Do you know what on earth this justification could be?

 

 

