Solved: Re: Decimals calculation problem

Maplefin · Posted 03-31-2025 02:09 AM

Hi, I recently found an issue: SAS may not give the correct answer for calculations with decimals. My code is below:

data x;
a=5.7;
b=4.75;
c=(a-b)/b;
if c=0.2 then flag="Y";
run;

As you can see, I want to get the proportion of (a-b) to b. It should be 0.2, and it seems that SAS gives the right answer. But to my surprise, the variable flag is not set to "Y", which means c is not strictly equal to 0.2. In fact, c is over 0.2. Can someone explain this to me? I really want to know the mechanism behind it.

andreas_lds · Posted 03-31-2025 02:17 AM

Just have a look at https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/lepg/p0dv87zb3bnse6n1mqo360be70qr.htm

- the sections starting with "Floating-Point" should explain the problem.

Ksharp · Posted 03-31-2025 02:29 AM

That is because a computer cannot store 0.2 exactly in binary. The same is true in other languages such as Java, Python, and R.

You need to round the value before comparing it.

data x;
a=5.7;
b=4.75;
c=(a-b)/b;
/*if round(c,1e-12)=0.2 then flag="Y";*/
if round(c,0.000000000001)=0.2 then flag="Y";
run;
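
As an alternative to ROUND, the same idea can be written as a tolerance check. This is only a minimal sketch: the variable eps and its value 1e-12 are assumptions chosen to mirror the rounding unit above, not something from the original code, and the right tolerance depends on the magnitude of your data.

```sas
data x;
  length flag $1;        /* declare flag so it has a defined length either way */
  a = 5.7;
  b = 4.75;
  c = (a - b) / b;
  eps = 1e-12;           /* assumed tolerance, pick one that suits your data */
  /* treat c as equal to 0.2 when the two differ by less than eps */
  if abs(c - 0.2) < eps then flag = "Y";
  else flag = "N";
run;
```

Both approaches accept values that differ from 0.2 only by floating-point noise; ROUND is usually the more convenient choice when the rounded value is also needed later.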

Maplefin · Posted 03-31-2025 03:01 AM

Thanks, rounding the decimals looks like a good solution.

Ksharp · Posted 03-31-2025 02:34 AM

https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/lepg/n0o6e1t72yad42n10y9wf8pgmat8.htm#p0lghy6...

FreelanceReinh · Posted 03-31-2025 08:15 AM

Hi @Maplefin,

Glad to see you found the practical ROUND solution to your issue.

So this is yet another example where numeric representation error and arithmetic calculations with numbers of limited precision entail tiny differences between the computer's results (not SAS-specific) and the results expected from using the decimal system. It's a particularly nice example, so let's take a look at the internal workings:

First, note that your starting values a and b look quite different in the binary system:

a=101.1011001100110...
b=100.11

While b=4.75=19/4 has an exact, finite binary representation using only five binary digits, a=5.7 is a repeating fraction in the binary system: The four-bit pattern "0110" that follows the first fractional bit is repeated forever. As a consequence, an unavoidable numeric representation error (here: 1.776E-16) occurs when 5.7 is rounded to fit into the 64-bit space (actually: 52 mantissa bits + 1 implied bit) available internally:

101.10110011001100110011001100110011001100110011001101

The last, 53rd bit (the final 1) has been rounded up and the decimal equivalent of this number is this:

5.70000000000000017763568394002504646778106689453125
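
To look at the stored values yourself, a sketch like the following may help. It relies on the HEX16. format, which for numeric variables writes the internal 8-byte floating-point representation; the exact output depends on the platform's floating-point format, so no output is shown here.

```sas
data _null_;
  a = 5.7;
  b = 4.75;
  /* BEST32. shows as many decimal digits as SAS is willing to print */
  put a= best32. / b= best32.;
  /* HEX16. shows the internal floating-point representation in hexadecimal */
  put a= hex16. / b= hex16.;
run;
```

On platforms using IEEE 754 doubles, the hexadecimal form of b should end in a long run of zeros (it is exact), while that of a should show a repeating digit pattern whose last digit has been rounded.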

Thanks to the simplicity of b in the binary system, the subtraction a-b is easy and results in the binary fraction

0.11110011001100110011001100110011001100110011001101000

The rounded bit from above (the last 1 before the three trailing zeros) is now the 50th bit, hence more significant than it was before. Indeed, the former numeric representation error has turned into a noticeable rounding error, as the number is slightly larger than the "expected" 0.95:

data _null_;
if 5.7 - 4.75 > 0.95 then put 'Surprised?';
run;

Surprised?
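
The size of that discrepancy can be made visible with a small sketch like this one. The variable names d, lit and diff are made up for illustration, and the printed values depend on the platform, so none are quoted here.

```sas
data _null_;
  d    = 5.7 - 4.75;   /* computed difference, carries the rounding error of 5.7 */
  lit  = 0.95;         /* numeric literal, carries its own representation error */
  diff = d - lit;      /* tiny but nonzero when the two stored values differ */
  put d= hex16. / lit= hex16.;
  put diff= e12.;      /* scientific notation makes the tiny difference readable */
run;
```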

Now you will not be surprised that this rounded-up bit also causes trouble in the division (a-b)/b. This calculation is a bit tedious (if done by hand), yet relatively simple due to the shortness of the binary representation of b. Its result is (after rounding up the 53rd bit):

0.0011001100110011001100110011001100110011001100110011011

which differs (in the least significant bit) from the correctly rounded repeating fraction being the binary equivalent of the decimal 0.2

0.0011001100110011001100110011001100110011001100110011010

Here comes the punchline: Ironically, even putting the "exact" numeric literal 0.95 (rather than the deviating a-b) into the division does not yield the perfect result. Instead, the numeric representation error of 0.95 (making this number internally a bit smaller than it should be: 0.9499999999999999555910790149937383830547332763671875) causes a result that is too small:

0.0011001100110011001100110011001100110011001100110011001

so that, again, an exact equality check in a DATA step would fail:

data _null_;
if 0.95/4.75 < 0.2  then put 'Surprised again?';
run;

Surprised again?
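
To compare the three values discussed above side by side, something along these lines could be used. It is a sketch only: c1, c2, ref and same are illustrative names, and the rounding unit 1e-12 is taken from the ROUND suggestion earlier in this thread.

```sas
data _null_;
  c1  = (5.7 - 4.75) / 4.75;   /* quotient from the original question */
  c2  = 0.95 / 4.75;           /* quotient using the literal 0.95 */
  ref = 0.2;                   /* stored value of the literal 0.2 */
  /* the hexadecimal forms reveal differences in the last digit */
  put c1= hex16. / ref= hex16. / c2= hex16.;
  /* after rounding, both quotients should compare equal to 0.2 */
  same = (round(c1, 1e-12) = 0.2) and (round(c2, 1e-12) = 0.2);
  put same=;
run;
```

Here same is 1 when both rounded quotients equal 0.2 and 0 otherwise, which is the practical reason for rounding before an equality check.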

Kurt_Bremser · Posted 03-31-2025 12:34 PM

Great explanation.

Everything would be much easier if we humans had evolved with only four fingers.

Maplefin · Posted 03-31-2025 11:17 PM

Great! Thanks for your detailed explanation!
