BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Kurt_Bremser
Super User

@Rick_SAS wrote:

Kurt, I haven't seen explicit code from you, so perhaps I am misunderstanding, but I think in principle Paige is correct. If the two numbers are on opposite sides of a cutpoint, the rounding will take them in opposite directions.

 

data A;
epsilon = 1e-16;
x = 0.00005;
y = x - epsilon;
RndX = round(x, 0.0001);
RndY = round(y, 0.0001);
put RndX=;
put RndY=;
run;

You are right, that's an edge case where round() can't fix it.

ChrisNZ
Tourmaline | Level 20

@PaigeMiller >And I'm still not convinced that round is a solution for all possible floating point numbers.

My experience is that rounding always works if done appropriately. Since the error is always very small, rounding by say a billionth gets rid of it.

To reuse @Rick_SAS 's code:

data A;
  X = 0.00005;
  Y = 0.00005 - 1e-16; 
  A=X=Y; 

  RndX = round(X, 1e-9);
  RndY = round(Y, 1e-9);

  B=RndX=RndY; 
  C=RndY=0.00005;

  putlog A= B= C=;
run;

A=0 B=1 C=1

Though I agree than in principle the decimal number cannot be represented, at least rounding correctly homogenises the binary values.

 

FreelanceReinh
Jade | Level 19

@Tom wrote:

The MOD() seems to fix this. Not sure if it does for all values or just this specific one.

The real solution is to ROUND() the value(s) before comparing.


Thanks for pointing this out. It's interesting that the MOD function seems to handle a number of these cases correctly, albeit in general only up to four decimals, e.g., mod(2.12345,1) ne 0.12345, whereas equality holds for decimals .0000 through .9999.

 

I guess this is a result of the "extra computations, called fuzzing, to return an exact zero when the result would otherwise differ from zero because of numerical error" (documentation). The remarkable thing is that this involves creating correct, non-trivial binary digits of the result which are not contained in the argument, especially if the integer part is large.

 

Example:

data _null_;
x=1414.2135;
y=mod(x,1);
z=.2135;
put x= binary64. / (y z)(+13 =binary64. /);
run;

Result:

x=0100000010010110000110001101101010011111101111100111011011001001
             y=0011111111001011010100111111011111001110110110010001011010000111
             z=0011111111001011010100111111011111001110110110010001011010000111

I've indented the (equal) values of y and z in order to align the parts of the mantissas which represent the fractional parts of the three numbers: For y and z, these are the 39 blue and the 13 green bits. The latter are not contained in the internal representation of x because 10 bits (red) are occupied by the integer part (1414) and 3 additional bits (purple) are needed for the fractional part (without normalization). Moreover, the least significant bit is rounded up.

 

I virtually always use the ROUND function (typically with rounding units such as 1e-9) when comparing numeric values with a small to moderate number of decimals.

 

ChrisNZ
Tourmaline | Level 20

@FreelanceReinh  > I  virtually always use the ROUND function (typically with rounding units such as 1e-9) when comparing numeric values ...

 

Proud to be aligned with an expert here! 🙂

 

Rick_SAS
SAS Super FREQ

I'm late to this discussion, but a guiding principle for numerical computations is "don't perform exact comparisons on floating-point numbers." You can use the COMPFUZZ function in Base SAS to help compare two values for equality.

 

For inequalities, the typical fuzzy comparison is to compare a difference with a "small value." "Small" is relative to the magnitude of the numbers. For example: [EDIT: I had signs wrong originally]

  • Instead of testing x < y, you can test (x - y) < -epsilon
  • Instead of x > y, you can test (x - y) > epsilon

The article "Avoid testing floating-point values for equality" includes references and some discussion of how to choose epsilon.

FreelanceReinh
Jade | Level 19

Hi @Rick_SAS,

 

I think the signs of "epsilon" in your inequalities should be reversed (unless epsilon is negative), assuming that the intention is to make the inequality stronger.

Rick_SAS
SAS Super FREQ

The equations are what I intended to write. Whether you agree with them depends on what you want to test. If I want to do a fuzzy test of 

x > y

then I am willing to allow x to be a tiny bit larger than y due to numerical roundoff. Thus I decide to test

x > y + epsilon

or 

x - y > epsilon

A similar argument holds for the less-than inequality.

FreelanceReinh
Jade | Level 19

@Rick_SAS wrote:

The equations are what I intended to write. Whether you agree with them depends on what you want to test. If I want to do a fuzzy test of 

x > y

then I am willing to allow x to be a tiny bit larger than y due to numerical roundoff. Thus I decide to test

x > y + epsilon

or 

x - y > epsilon

A similar argument holds for the less-than inequality.


Now the inequality is correct and matches my understanding, but in your previous post (quoted below) the signs are reversed, aren't they?

 

@Rick_SAS wrote:

(...)

  • Instead of testing x < y, you can test (x - y) < epsilon
  • Instead of x > y, you can test (x - y) > -epsilon

 

ChrisNZ
Tourmaline | Level 20

The missing function:

options cmplib=WORK.FUNCS;        

proc fcmp outlib=WORK.FUNCS.MATH;
  function equalz(A,B);
    if      missing(A) & missing(B) then return ( A = B            ) ;
    else if missing(A) | missing(B) then return ( 0                ) ;
    else                                 return ( abs(A-B) < 1e-12 ) ;
  endsub;
run;

 

Rick_SAS
SAS Super FREQ

Yes, you are correct. Thanks for the sharp eyes. I edited my original response to correct the mistake.

FreelanceReinh
Jade | Level 19

Hi @SP_SAS,

 

Here is what happens internally in the calculation 2.55-2:

In the binary system (mathematically), the decimal number 2.55 looks like this:

10.1000110011001100110011001100110011001100110011001100110011001100...

The pattern "0011" is infinitely repeating.

 

The internal 64-bit floating-point representations of 2.55, 2.55−2 and 0.55 in SAS under Windows (as documented in Numerical Accuracy in SAS Software) can be displayed with the BINARY64. format:

720  data _null_;
721  do x=2.55, 2.55-2, 0.55;
722    put x binary64.;
723  end;
724  run;

0100000000000100011001100110011001100110011001100110011001100110
0011111111100001100110011001100110011001100110011001100110011000
0011111111100001100110011001100110011001100110011001100110011010

You see the repeating pattern here as well. The binary digits highlighted in blue represent the fractional parts of the numbers. For 2.55 two more bits (red and purple) are needed due to the non-zero integer part: These are the two digits surrounding the point (".") in the mathematical binary representation above. When 2 is subtracted from 2.55, the integer part cancels out, these two bits are removed, the subsequent bits are shifted to the left and leave room for two bits at the end.

In the result 2.55−2 these two least significant bits are simply filled with zeros (highlighted in bold) because the representation of 2.55 doesn't contain the corresponding information. The best possible representation of 0.55, however, contains non-trivial digits in these places: The exact last two digits 01, followed by 10011001... in the mathematical representation, are rounded up to 10.

 

So, the difference between 0.55 and 2.55−2 in the internal representation is the "1" in the penultimate bit. This "1" stands for the value 2**-52=2.2204...E-16 (see @Tom's variable diff).

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 25 replies
  • 1185 views
  • 21 likes
  • 8 in conversation