@Rick_SAS wrote:
Kurt, I haven't seen explicit code from you, so perhaps I am misunderstanding, but I think in principle Paige is correct. If the two numbers are on opposite sides of a cutpoint, the rounding will take them in opposite directions.
data A;
epsilon = 1e-16;
x = 0.00005;
y = x - epsilon;
RndX = round(x, 0.0001);
RndY = round(y, 0.0001);
put RndX=;
put RndY=;
run;
You are right, that's an edge case where round() can't fix it.
@PaigeMiller >And I'm still not convinced that round is a solution for all possible floating point numbers.
My experience is that rounding always works if done appropriately. Since the error is always very small, rounding to, say, the nearest billionth gets rid of it.
To reuse @Rick_SAS 's code:
data A;
X = 0.00005;
Y = 0.00005 - 1e-16;
A=X=Y;                  /* exact comparison: fails (A=0)                 */
RndX = round(X, 1e-9);
RndY = round(Y, 1e-9);
B=RndX=RndY;            /* comparison after rounding: succeeds (B=1)     */
C=RndY=0.00005;         /* rounded value also matches the literal (C=1)  */
putlog A= B= C=;
run;
A=0 B=1 C=1
Though I agree that in principle the decimal number cannot be represented, at least rounding correctly homogenises the binary values.
@Tom wrote:
The MOD() seems to fix this. Not sure if it does for all values or just this specific one.
The real solution is to ROUND() the value(s) before comparing.
Thanks for pointing this out. It's interesting that the MOD function seems to handle a number of these cases correctly, albeit in general only up to four decimals, e.g., mod(2.12345,1) ne 0.12345, whereas equality holds for decimals .0000 through .9999.
I guess this is a result of the "extra computations, called fuzzing, to return an exact zero when the result would otherwise differ from zero because of numerical error" (documentation). The remarkable thing is that this involves creating correct, non-trivial binary digits of the result which are not contained in the argument, especially if the integer part is large.
Example:
data _null_;
x=1414.2135;
y=mod(x,1);
z=.2135;
put x= binary64. / (y z)(+13 =binary64. /);
run;
Result:
x=0100000010010110000110001101101010011111101111100111011011001001
             y=0011111111001011010100111111011111001110110110010001011010000111
             z=0011111111001011010100111111011111001110110110010001011010000111
I've indented the (equal) values of y and z in order to align the parts of the mantissas which represent the fractional parts of the three numbers: for y and z these are all 52 explicit mantissa bits, i.e., the first 39, which line up with corresponding bits of x, plus the final 13. The latter are not contained in the internal representation of x because 10 mantissa bits are occupied by the integer part (1414) and 3 additional bits are needed for the leading digits of the fractional part, which x stores without normalization. Moreover, the least significant bit is rounded up.
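Returning to the four-decimal observation above, here is a quick check (the specific values are only illustrative; the comments reflect the behaviour reported above):

data _null_;
four = (mod(2.1234, 1)  = 0.1234);   /* fuzzing yields exact equality with four decimals */
five = (mod(2.12345, 1) = 0.12345);  /* but not with five                                */
put four= five=;
run;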
I virtually always use the ROUND function (typically with rounding units such as 1e-9) when comparing numeric values with a small to moderate number of decimals.
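For instance, a minimal sketch of that pattern, applied to the 2.55-2 vs. 0.55 comparison discussed elsewhere in this thread:

data _null_;
if round(2.55 - 2, 1e-9) = round(0.55, 1e-9) then put 'Equal after rounding';
else put 'Not equal';
run;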
@FreelanceReinh > I virtually always use the ROUND function (typically with rounding units such as 1e-9) when comparing numeric values ...
Proud to be aligned with an expert here! 🙂
I'm late to this discussion, but a guiding principle for numerical computations is "don't perform exact comparisons on floating-point numbers." You can use the COMPFUZZ function in Base SAS to help compare two values for equality.
For inequalities, the typical fuzzy comparison is to compare a difference with a "small value." "Small" is relative to the magnitude of the numbers. For example: [EDIT: I had signs wrong originally]
- Instead of testing x < y, you can test (x - y) < -epsilon
- Instead of x > y, you can test (x - y) > epsilon
The article "Avoid testing floating-point values for equality" includes references and some discussion of how to choose epsilon.
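For example, a minimal sketch of such a fuzzy equality test (the 1e-9 tolerance is an arbitrary illustrative choice; the COMPFUZZ function mentioned above is a built-in alternative to hand-rolling this):

data _null_;
x = 2.55 - 2;
y = 0.55;
exact = (x = y);              /* 0: exact comparison fails                 */
fuzzy = (abs(x - y) < 1e-9);  /* 1: the difference is only about 2.22E-16  */
put exact= fuzzy=;
run;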
Hi @Rick_SAS,
I think the signs of "epsilon" in your inequalities should be reversed (unless epsilon is negative), assuming that the intention is to make the inequality stronger.
The equations are what I intended to write. Whether you agree with them depends on what you want to test. If I want to do a fuzzy test of
x > y
then I am willing to allow x to be a tiny bit larger than y due to numerical roundoff. Thus I decide to test
x > y + epsilon
or
x - y > epsilon
A similar argument holds for the less-than inequality.
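A small illustration with the values from this thread (epsilon = 1e-9 is an arbitrary tolerance): 0.55 exceeds 2.55-2 only by roundoff, so the fuzzy test does not count it as greater.

data _null_;
x = 0.55;
y = 2.55 - 2;
epsilon = 1e-9;
naive = (x > y);            /* 1: x exceeds y, but only by 2**-52         */
fuzzy = (x - y > epsilon);  /* 0: the excess is far smaller than epsilon  */
put naive= fuzzy=;
run;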
@Rick_SAS wrote:
The equations are what I intended to write. Whether you agree with them depends on what you want to test. If I want to do a fuzzy test of
x > y
then I am willing to allow x to be a tiny bit larger than y due to numerical roundoff. Thus I decide to test
x > y + epsilon
or
x - y > epsilon
A similar argument holds for the less-than inequality.
Now the inequality is correct and matches my understanding, but in your previous post (quoted below) the signs are reversed, aren't they?
@Rick_SAS wrote:
(...)
- Instead of testing x < y, you can test (x - y) < epsilon
- Instead of x > y, you can test (x - y) > -epsilon
The missing function:
options cmplib=WORK.FUNCS;
proc fcmp outlib=WORK.FUNCS.MATH;
function equalz(A,B);
if missing(A) & missing(B) then return ( A = B ) ;   /* both missing: equal only if the same missing value */
else if missing(A) | missing(B) then return ( 0 ) ;  /* exactly one missing: not equal                     */
else return ( abs(A-B) < 1e-12 ) ;                   /* both present: fuzzy comparison                     */
endsub;
run;
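A quick usage sketch (assuming the library above has been compiled in the current session with CMPLIB set as shown):

data _null_;
x = 2.55 - 2;
y = 0.55;
eq1 = equalz(x, y);  /* 1: difference is only about 2.22E-16  */
eq2 = equalz(x, .);  /* 0: one value missing                  */
eq3 = equalz(., .);  /* 1: both missing                       */
put eq1= eq2= eq3=;
run;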
Yes, you are correct. Thanks for the sharp eyes. I edited my original response to correct the mistake.
Hi @SP_SAS,
Here is what happens internally in the calculation 2.55-2:
In the binary system (mathematically), the decimal number 2.55 looks like this:
10.1000110011001100110011001100110011001100110011001100110011001100...
The pattern "0011" is infinitely repeating.
The internal 64-bit floating-point representations of 2.55, 2.55−2 and 0.55 in SAS under Windows (as documented in Numerical Accuracy in SAS Software) can be displayed with the BINARY64. format:
data _null_;
do x=2.55, 2.55-2, 0.55;
put x binary64.;
end;
run;

0100000000000100011001100110011001100110011001100110011001100110
0011111111100001100110011001100110011001100110011001100110011000
0011111111100001100110011001100110011001100110011001100110011010
You see the repeating pattern here as well. The trailing mantissa bits represent the fractional parts of the numbers: all 52 explicit mantissa bits for 2.55−2 and 0.55, but only the last 50 for 2.55, because two more bits are needed due to the non-zero integer part. These two extra bits are the two digits surrounding the point (".") in the mathematical binary representation above. When 2 is subtracted from 2.55, the integer part cancels out, these two bits are removed, the subsequent bits are shifted to the left and leave room for two bits at the end.
In the result 2.55−2 these two least significant bits are simply filled with zeros because the representation of 2.55 doesn't contain the corresponding information. The best possible representation of 0.55, however, contains non-trivial digits in these places: the exact last two digits, 01 (followed by 10011001... in the mathematical representation), are rounded up to 10.
So, the difference between 0.55 and 2.55−2 in the internal representation is the "1" in the penultimate bit. This "1" stands for the value 2**-52=2.2204...E-16 (see @Tom's variable diff).
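A small check of that value:

data _null_;
diff = 0.55 - (2.55 - 2);  /* the "1" in the penultimate bit    */
bit  = 2**(-52);           /* value of that bit position        */
same = (diff = bit);       /* the difference is exactly 2**-52  */
put diff= bit= same=;
run;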