I have an odd exprience when rounding a number.
One number is rounded down and the other up:
data rounding; d1=round(201578736.46500, .01); d2=round(276578736.46500, .01); run; proc print data=rounding noobs; run;
The first one rounds to .47 and the other one to .46
Are there some kind of limit on the input to round()?
Hello @SHFBD and welcome to the SAS Support Communities!
As PaigeMiller has mentioned, this has to do with the unavoidable loss of precision that typically occurs when decimal fractions are stored as binary representations.
Let's take a look at how SAS stores these two numbers, 201578736.46500 and 276578736.46500, in a numeric variable of length 8 (bytes) under Windows or Unix.
The decimal number 201578736.465, converted to the binary system (and with some line breaks inserted), looks like this:
1100000000111101100011110000.01110111000010100011110 10111000010100011110 10111000010100011110 101110...
The 20-binary-digit pattern "10111000010100011110" is repeated infinitely many times ("periodic fraction").
The internal 64-bit floating-point representation that SAS under Windows or Unix uses to store the number in an 8-byte numeric variable is based on the above binary representation. However, only 52 of the 64 bits (let alone infinitely many) are available for the binary digits following the first "1". The remaining binary digits are rounded (in this particular example: rounded up), as can be seen with the BINARY64. format:
data _null_;
x=201578736.465;
put x binary64.;
run;
Result:
0100000110101000000001111011000111100000111011100001010001111011
From the mathematical binary representation further above we know that the last two bits, 11, resulted from rounding 1011100001... to 1100000000... Apart from this, the 52 "mantissa bits" (in bold face above) are copied from the periodic binary representation and we recognize the repeating 20-binary-digit pattern (highlighted in blue). If we convert this rounded binary representation back to the decimal system, we obtain
201578736.4650000035762786865234375
and the (so called) numeric representation error of approx. 3.576E-9 becomes obvious. Given that this number is even slightly larger than 201578736.465, we're not surprised that the ROUND function (with a rounding unit of 0.01) rounds it correctly to 201578736.47.
Now we do the same investigation with 276578736.465:
The mathematical conversion to the binary system results in:
10000011111000100000110110000.01110111000010100011110 10111000010100011110 10111000010100011110 101110...
The fractional part is, of course, identical to that of the first number because it represents the decimal fraction of 0.465 in both cases. Only the integer part has changed. Note that it has become longer! Since we have crossed a power of 2, namely 2**28 = 268435456, we now have 29 rather than 28 binary digits on the left of the point. As a consequence, the rounding for the internal representation occurs at a different digit! Indeed, the mantissa is now rounded down, as can be seen again with the BINARY64. format:
data _null_;
x=276578736.465;
put x binary64.;
run;
Result:
0100000110110000011111000100000110110000011101110000101000111101
Once more thanks to the exact periodic fraction, we know that the last bit, 1, resulted from rounding 1011100001... to 1000000000... If we convert this rounded binary representation back to the decimal system, we obtain
276578736.464999973773956298828125
and the representation error of approx. -2.623E-8 becomes obvious. Given that this number is slightly less than 276578736.465, it seems now plausible that it is rounded down to 276578736.46.
For a full explanation, however, we would need to know the internal workings of the ROUND function. Suffice it to say that the ROUND function performs some "extra computations, called fuzzing, to try to make the result agree with decimal arithmetic in the most common situations" (documentation; see also the ROUNDZ function, which "does not fuzz the result"). Apparently, for the second number the amount of fuzzing applied by the ROUND function does not suffice to compensate for the numeric representation error -2.623E-8, which is much larger (EDIT: in absolute value) than it would be for a number with a magnitude in the thousands or so. I would not expect such incorrect rounding to occur for much smaller numbers like 1234.465.
For more details and strategies for handling this issue (such as "defining your own rounding routine") please see http://support.sas.com/resources/papers/dealing-with-numeric-representation-error-in-sas-application... (in particular p. 11 ff.).
Most numbers that are not integers cannot be represented exactly in digital computers. This is called "machine precision". The numbers can be represented only to within ± a small value epsilon. So the problem isn't the ROUND() function, the problem is a systematic problem that all computers have.
Hello @SHFBD and welcome to the SAS Support Communities!
As PaigeMiller has mentioned, this has to do with the unavoidable loss of precision that typically occurs when decimal fractions are stored as binary representations.
Let's take a look at how SAS stores these two numbers, 201578736.46500 and 276578736.46500, in a numeric variable of length 8 (bytes) under Windows or Unix.
The decimal number 201578736.465, converted to the binary system (and with some line breaks inserted), looks like this:
1100000000111101100011110000.01110111000010100011110 10111000010100011110 10111000010100011110 101110...
The 20-binary-digit pattern "10111000010100011110" is repeated infinitely many times ("periodic fraction").
The internal 64-bit floating-point representation that SAS under Windows or Unix uses to store the number in an 8-byte numeric variable is based on the above binary representation. However, only 52 of the 64 bits (let alone infinitely many) are available for the binary digits following the first "1". The remaining binary digits are rounded (in this particular example: rounded up), as can be seen with the BINARY64. format:
data _null_;
x=201578736.465;
put x binary64.;
run;
Result:
0100000110101000000001111011000111100000111011100001010001111011
From the mathematical binary representation further above we know that the last two bits, 11, resulted from rounding 1011100001... to 1100000000... Apart from this, the 52 "mantissa bits" (in bold face above) are copied from the periodic binary representation and we recognize the repeating 20-binary-digit pattern (highlighted in blue). If we convert this rounded binary representation back to the decimal system, we obtain
201578736.4650000035762786865234375
and the (so called) numeric representation error of approx. 3.576E-9 becomes obvious. Given that this number is even slightly larger than 201578736.465, we're not surprised that the ROUND function (with a rounding unit of 0.01) rounds it correctly to 201578736.47.
Now we do the same investigation with 276578736.465:
The mathematical conversion to the binary system results in:
10000011111000100000110110000.01110111000010100011110 10111000010100011110 10111000010100011110 101110...
The fractional part is, of course, identical to that of the first number because it represents the decimal fraction of 0.465 in both cases. Only the integer part has changed. Note that it has become longer! Since we have crossed a power of 2, namely 2**28 = 268435456, we now have 29 rather than 28 binary digits on the left of the point. As a consequence, the rounding for the internal representation occurs at a different digit! Indeed, the mantissa is now rounded down, as can be seen again with the BINARY64. format:
data _null_;
x=276578736.465;
put x binary64.;
run;
Result:
0100000110110000011111000100000110110000011101110000101000111101
Once more thanks to the exact periodic fraction, we know that the last bit, 1, resulted from rounding 1011100001... to 1000000000... If we convert this rounded binary representation back to the decimal system, we obtain
276578736.464999973773956298828125
and the representation error of approx. -2.623E-8 becomes obvious. Given that this number is slightly less than 276578736.465, it seems now plausible that it is rounded down to 276578736.46.
For a full explanation, however, we would need to know the internal workings of the ROUND function. Suffice it to say that the ROUND function performs some "extra computations, called fuzzing, to try to make the result agree with decimal arithmetic in the most common situations" (documentation; see also the ROUNDZ function, which "does not fuzz the result"). Apparently, for the second number the amount of fuzzing applied by the ROUND function does not suffice to compensate for the numeric representation error -2.623E-8, which is much larger (EDIT: in absolute value) than it would be for a number with a magnitude in the thousands or so. I would not expect such incorrect rounding to occur for much smaller numbers like 1234.465.
For more details and strategies for handling this issue (such as "defining your own rounding routine") please see http://support.sas.com/resources/papers/dealing-with-numeric-representation-error-in-sas-application... (in particular p. 11 ff.).
Thank you for a very comprehensive answer.
Maybe I just have to accept this loss of precision, but then I would expect it to be the same for all roundings on 276578736.46500, right? But if I use input(276578736.46500, 15.2) it rounds to .47, aswell as if I use "format=15.2" in PROC SQL?
data rounding;
d1=round(201578736.46500, .01); /*28 bit integer*/
d2=round(276578736.46500, .01); /*29 bit integer*/
d3=round(971578736.46500, .01); /*30 bit integer*/
d4=input(276578736.46500, 15.2);
run;
proc print data=rounding noobs;
run;
You're welcome.
@SHFBD wrote:
Thank you for a very comprehensive answer.
Maybe I just have to accept this loss of precision, but then I would expect it to be the same for all roundings on 276578736.46500, right? But if I use input(276578736.46500, 15.2) it rounds to .47, aswell as if I use "format=15.2" in PROC SQL?
No, that's not how SAS does it. The ROUND function uses a different (I think: more sophisticated) algorithm than a mere numeric format.
Your example using the INPUT function actually involves an automatic numeric-to-character conversion (see note in the log), which is done using the BEST12. format. So, the relevant intermediate result is that of
put(276578736.46500, best12.)
which is the character string '276578736.47' (rounded up, correctly, but not matching the result of the ROUND function). Then the decimal specification ".2" of the informat 15.2 is ignored because the string contains a decimal point. Applying the format 15.2 is a different case, but the result for this particular number is the same as with BEST12. (except for three leading blanks).
I think you obtain results which are more consistent with the internal binary representation if you switch the DECIMALCONV= system option from COMPATIBLE (the default) to STDIEEE:
options decimalconv=stdieee;
Now both the BEST12. and the 15.2 format round 276578736.465 down to 276578736.46 as does the ROUND function for the reason I explained earlier.
The result in variable d3 can be explained in the same way as in the other two examples. The internal binary floating-point representation of 971578736.46500 (under Windows), converted back to the decimal system is the number
971578736.46500003337860107421875
and the ROUND function (with rounding unit 0.01) rounds it up.
Thank you again 😃
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.