Re: Numeric field value different from actual value

ItsMeAG · Posted 08-28-2025 04:51 PM

Hi Friends,

See the following code

data tmp1;
  a = 0.8;
  b = 0.6;
  c = 0.4;
  d = 0.2;
  total = sum(a,b,c,d);
run;

data tmp2;
  set tmp1;
  if total >= 2 then flag = 1;
  else flag = 0;
  diff = sum(-2,total);
  format diff 32.25;
run;

print of tmp2:

Here I am getting total = 2, but the IF condition '>= 2' is not satisfied and flag comes out as 0.

I looked at the difference between total and 2 and got "-0.00000000000000022204".

Can someone please help me understand why the value of total is not actually 2, and how to solve this?

Thanks in advance for your time and support!

Tom · Posted 08-28-2025 09:46 PM

Because SAS (like most computer languages) uses binary (base 2) floating-point numbers, not the decimal (base 10) numbers that humans like to use.

None of those four values can be represented exactly as a binary floating-point number, and when you add them up the accumulated error is large enough that an exact comparison with 2 fails.

Why not just use the ROUND() function to reduce the number to a reasonable number of digits?

data test;
  input x1-x4;
  total = sum(of x1-x4);
  diff = total-2;
  if total=2 then put TOTAL= 'is exactly 2.';
  if round(total,0.0001)=2 then put TOTAL= 'is close enough to 2.'
    / 'since difference is ' diff:32.25 
  ;
cards;
.8 .6 .4 .2
;
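
Applied to the original flag logic, the same idea might look like the sketch below. The rounding unit 1e-9 is an assumption chosen for illustration; pick a unit that matches the precision your data actually needs.

```sas
data tmp1;
  a = 0.8; b = 0.6; c = 0.4; d = 0.2;
  total = sum(a, b, c, d);
run;

data tmp2;
  set tmp1;
  /* Round to the nearest multiple of 1e-9 (assumed tolerance) */
  /* before comparing, so tiny binary rounding errors vanish.  */
  if round(total, 1e-9) >= 2 then flag = 1;
  else flag = 0;
  /* Write the stored value in hex next to the resulting flag. */
  put total= hex16. flag=;
run;
```

With the rounding in place the comparison should treat TOTAL as 2, so FLAG should come out as 1. Note that TOTAL itself is left unchanged; only the value used in the comparison is rounded.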

WarrenKuhfeld · Posted 09-02-2025 02:18 PM

In addition to the good advice you have already received, I would add this. Look at the options in PROC COMPARE so that you can see a variety of methods for comparing computed numbers that work for a variety of situations. https://documentation.sas.com/doc/en/pgmsascdc/v_066/proc/p0bbu58eqgufwzn16zafm1hvzfw2.htm
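
As a rough sketch of what that can look like, the step below compares a computed total against a stored 2 using the METHOD= and CRITERION= options of PROC COMPARE. The data set names EXPECTED and COMPUTED and the criterion 1e-9 are made up for this illustration.

```sas
data expected;
  total = 2;
run;

data computed;
  total = sum(0.8, 0.6, 0.4, 0.2);
run;

/* Exact comparison: any difference at all counts as unequal. */
proc compare base=expected compare=computed method=exact;
run;

/* Absolute comparison: values are judged equal when          */
/* abs(difference) does not exceed the criterion (assumed).   */
proc compare base=expected compare=computed
             method=absolute criterion=1e-9;
run;
```

Comparing the two reports side by side shows how the choice of method changes whether the tiny difference is reported.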

Patrick · Posted 08-28-2025 09:54 PM

The SAS documentation under "Numeric Precision" gives some more detail on what @Tom already explained.

FreelanceReinh · Posted 09-02-2025 01:46 PM

Hi @ItsMeAG,

This is one of the more interesting examples of the usual numeric precision issues, as it sheds light on how the calculation is done internally.

As Tom has pointed out, the internal 64-bit binary floating-point representations of the four numbers 0.8, 0.6, 0.4 and 0.2 are inexact. They contain unavoidable rounding errors because those numbers are repeating fractions in the binary number system (with 4-bit sequences repeating infinitely):

decimal   binary representations
value     exact           rounded to 53 signif. bits (as in 8-byte SAS variables)     rounding error (exact - rounded)
-------   -------------------------------------------------------------------------   --------------------------------
0.8       0.11001100...   0.11001100110011001100110011001100110011001100110011010     -0.8*2**-54
0.6       0.10011001...   0.10011001100110011001100110011001100110011001100110011      0.4*2**-54
0.4       0.01100110...   0.011001100110011001100110011001100110011001100110011010    -0.4*2**-54
0.2       0.00110011...   0.0011001100110011001100110011001100110011001100110011010   -0.2*2**-54

Yet, adding the above rounded binary numbers in one go would yield 2 + 2**-54 (as you can see by adding the last few bits manually), which would be rounded to the exact value 2. So, SAS must do something else. There is strong evidence (from millions of other examples) that the sum is calculated in three steps as ((a+b)+c)+d, i.e., the result of a+b=0.8+0.6 is rounded to 53 significant bits before 0.4 is added. Finally, 0.2 is added to the rounded result of the previous calculation.

It turns out that the second of the three steps mentioned above introduces a comparatively large rounding error (0.8*2**-52), so that the intermediate result satisfies 1.4+0.4 < 1.8 (*). In the third step the rounding error grows further to 2**-52, and its negative is the value you obtained in your variable DIFF.

(*) See this log:

109   data _null_;
110   if 1.4+0.4 < 1.8 then put 'Surprised?';
111   run;

Surprised?
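
To look at the three steps individually, a small sketch like the following can help. The variable names S1 to S3 are just chosen for this illustration. It writes each intermediate sum in HEX16. format and checks whether the stepwise result has the same bits as the SUM() result.

```sas
data _null_;
  s1 = 0.8 + 0.6;   /* step 1: result is rounded to 53 bits */
  s2 = s1  + 0.4;   /* step 2: rounded again                */
  s3 = s2  + 0.2;   /* step 3: rounded again                */
  total = sum(0.8, 0.6, 0.4, 0.2);

  /* HEX16. shows the full internal 8-byte representation. */
  put s1= hex16. / s2= hex16. / s3= hex16. / total= hex16.;

  if s3 = total then put 'Stepwise sum matches SUM().';
  else put 'Stepwise sum differs from SUM().';
run;
```

The hex values depend on the platform's floating-point representation, so the sketch deliberately does not promise a particular output.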

@ItsMeAG wrote:

(...)

Here I am getting total = 2 , ...

No, you are getting a value that is displayed as 2 in the default numeric format. Exact formats such as BINARY64. or HEX16. would show (like variable DIFF) that the value of TOTAL slightly differs from 2.
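
A quick way to see this for yourself is to write the same value with several formats, next to a variable that really contains 2. This is only a sketch; the variable TWO is introduced here for comparison.

```sas
data _null_;
  total = sum(0.8, 0.6, 0.4, 0.2);
  two   = 2;

  /* Default-style display: both values look the same. */
  put total= best12. two= best12.;

  /* Exact formats: every bit of the 8-byte value is shown. */
  put total= hex16.    / two= hex16.;
  put total= binary64. / two= binary64.;
run;
```

If the two variables really differ, the HEX16. and BINARY64. lines will differ even though the BEST12. line does not.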

Kurt_Bremser · Posted 09-02-2025 03:27 PM

It's perfectly logical that

a+b+c+d

is done as

((a+b)+c)+d

The ALU (Arithmetic-Logical Unit) in a processor can only do operations with two numbers/values at a time, and as long as there's no other precedence, the calculation is done left to right.
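
One way to probe the evaluation order is to compare the unparenthesized expression with explicitly grouped variants. The sketch below uses made-up variable names and only writes the internal representations, so you can check on your own system which grouping the plain expression agrees with.

```sas
data _null_;
  a = 0.8; b = 0.6; c = 0.4; d = 0.2;

  implicit   = a + b + c + d;        /* no parentheses          */
  left_right = ((a + b) + c) + d;    /* explicit left to right  */
  pairwise   = (a + b) + (c + d);    /* a different grouping    */

  put implicit= hex16. / left_right= hex16. / pairwise= hex16.;
run;
```

Because every intermediate result is rounded to 53 significant bits, different groupings of the same four numbers are not guaranteed to produce identical bits.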

FreelanceReinh · Posted 09-02-2025 05:04 PM

@Kurt_Bremser wrote:
The ALU (Arithmetic-Logical Unit) in a processor can only do operations with two numbers/values at a time, and as long as there's no other precedence, the calculation is done left to right.

Thanks for contributing this great argument, which is also consistent with the "left to right" order of evaluation in "Group 3" of the table "Order of Operation in Compound Expressions."
