Neither of the following codes filters the rows properly. The discount column was initially formatted as a percentage, I changed it to Comma32.28 to see all decimals but there is nothing but zeros in the decimals. So, if I write WHERE Discount > 0.3 it shouldn't return columns where the discount value is 0.3, am I wrong?
What is wrong with this program? Why it keeps including equality when I specifically say > or GT. How is 0.300000000000000000000 greater than 0.3? Am I missing something?
Has anyone else encountered a similar problem? Is it because of lack of ORDER BY or something? Couldn't think of anything else...
Hello @ikoyuncu,
It's true that SAS cannot store 0.3 exactly in its 64-bit binary floating-point representation -- indeed, in the binary system 0.3 is a periodic fraction: 0.0100110011001..., so the internal representation is rounded and under Windows it really corresponds to 0.299999999999999988897769753748434595763683319091796875 (= 0.3 - 2**-54 / 5 = 0.3 - 1.1102...E-17) -- but I wouldn't say that the numeric representation error (-1.11...E-17) of this particular number is the main reason for the rounding error in your variable Discount (whose value in the cases in question must differ from 0.3 by more than the unavoidable numeric representation error).
Virtually all numbers, including (what should be) integers, can be "contaminated" with similar rounding errors.
Examples (log from SAS 9.4M5 under Windows):
409 data _null_; 410 if 1.4-0.4 < 1 < 2.2-1.2 411 & 1.005*1000 < 1005 < 10.05*100 412 & 0.6/0.2 < 3 < 9.9/3.3 413 then put 'Surprised?'; 414 run; Surprised? NOTE: DATA statement used (Total process time):
Calculations with numbers (like 0.3 and many others) affected by numeric representation error can easily produce rounding errors making the results different from what they should be mathematically, as shown in the examples above and below.
So the question is rather how those Discount values only looking like 0.3 or 0.3000... (in standard formats) were created. I suspect that they were calculated from amounts of money with their typical (up to) two decimal places. (Note that, in a sense, 96 percent of the numbers with up to 2 decimal places are affected by numeric representation error.)
Examples:
data test;
input x y;
Discount=x/y;
length c $12;
h=put(Discount, hex16.);
diff=Discount-0.3;
if Discount<0.3 then c='Discount<0.3';
else if Discount=0.3 then c='Discount=0.3';
else if Discount>0.3 then c='Discount>0.3';
cards;
2.7 9
240.03 800.1
2100.99 7003.3
;
Result:
Obs x y Discount c h diff 1 2.70 9.0 0.3 Discount>0.3 3FD3333333333334 5.55112E-17 2 240.03 800.1 0.3 Discount=0.3 3FD3333333333333 0 3 2100.99 7003.3 0.3 Discount<0.3 3FD3333333333332 -5.5511E-17
Mathematically, all three Discount values should be 0.3, but only the result in obs. 2 happens to match the internal representation of 0.3. The other two results contain rounding errors in the least significant bit (place value: 2**-54 = 5.551...E-17) which make them different from 0.3 in the IF conditions. This is obvious in formats such as HEX16. revealing the internal representation, but often not in common formats like the standard w.d format or the COMMAw.d format because they are rounding. For instance, format 32.30 reveals a rounding error in obs. 3, but not in obs. 1, unless you look at the difference diff.
A definition of Discount using the ROUND function such as
Discount=round(x/y, 1e-9);
would have avoided the issues in the examples and in similar cases. Alternatively, the ROUND function can be applied in the comparison (as Tom has suggested) or in a preliminary "data cleaning" step, e.g., if the Discount values were imported from another database.
SAS (and pretty much all computers) store numbers using BINARY representation. There is no way to represent 3/10ths exactly using binary numbers. The value that your query is saying is larger then 0.3 is larger than 0.3 by such a small amount that it does not display when printed as a decimal fraction.
Round the value and then test.
How many digits of precision do those numbers really have? Let's say it is to the thousandths place. So round to the thousandths place or the ten-thousands place and then compare.
WHERE round(Discount,0.0001) > 0.3
@Tom answered.
0.3 cannot be represented exactly as a power of 2, just like 1/3 cannot be represented in the decimal base. Look up numerical precision if you want to know more.
> How is 0.300000000000000000000 greater than 0.3?
You can add as many zeros as you want in the code, it makes no difference as the computer cannot use them when storing the decimal number in 8 bytes.
.
For more about the nuances of floating point math and the apparent inequality of certain numbers that "look" the same, see Numerical precision in SAS. But as others said, this isn't unique to SAS -- it's an IEEE standard for how floating point numbers are represented in systems/programming languages in general.
Hello @ikoyuncu,
It's true that SAS cannot store 0.3 exactly in its 64-bit binary floating-point representation -- indeed, in the binary system 0.3 is a periodic fraction: 0.0100110011001..., so the internal representation is rounded and under Windows it really corresponds to 0.299999999999999988897769753748434595763683319091796875 (= 0.3 - 2**-54 / 5 = 0.3 - 1.1102...E-17) -- but I wouldn't say that the numeric representation error (-1.11...E-17) of this particular number is the main reason for the rounding error in your variable Discount (whose value in the cases in question must differ from 0.3 by more than the unavoidable numeric representation error).
Virtually all numbers, including (what should be) integers, can be "contaminated" with similar rounding errors.
Examples (log from SAS 9.4M5 under Windows):
409 data _null_; 410 if 1.4-0.4 < 1 < 2.2-1.2 411 & 1.005*1000 < 1005 < 10.05*100 412 & 0.6/0.2 < 3 < 9.9/3.3 413 then put 'Surprised?'; 414 run; Surprised? NOTE: DATA statement used (Total process time):
Calculations with numbers (like 0.3 and many others) affected by numeric representation error can easily produce rounding errors making the results different from what they should be mathematically, as shown in the examples above and below.
So the question is rather how those Discount values only looking like 0.3 or 0.3000... (in standard formats) were created. I suspect that they were calculated from amounts of money with their typical (up to) two decimal places. (Note that, in a sense, 96 percent of the numbers with up to 2 decimal places are affected by numeric representation error.)
Examples:
data test;
input x y;
Discount=x/y;
length c $12;
h=put(Discount, hex16.);
diff=Discount-0.3;
if Discount<0.3 then c='Discount<0.3';
else if Discount=0.3 then c='Discount=0.3';
else if Discount>0.3 then c='Discount>0.3';
cards;
2.7 9
240.03 800.1
2100.99 7003.3
;
Result:
Obs x y Discount c h diff 1 2.70 9.0 0.3 Discount>0.3 3FD3333333333334 5.55112E-17 2 240.03 800.1 0.3 Discount=0.3 3FD3333333333333 0 3 2100.99 7003.3 0.3 Discount<0.3 3FD3333333333332 -5.5511E-17
Mathematically, all three Discount values should be 0.3, but only the result in obs. 2 happens to match the internal representation of 0.3. The other two results contain rounding errors in the least significant bit (place value: 2**-54 = 5.551...E-17) which make them different from 0.3 in the IF conditions. This is obvious in formats such as HEX16. revealing the internal representation, but often not in common formats like the standard w.d format or the COMMAw.d format because they are rounding. For instance, format 32.30 reveals a rounding error in obs. 3, but not in obs. 1, unless you look at the difference diff.
A definition of Discount using the ROUND function such as
Discount=round(x/y, 1e-9);
would have avoided the issues in the examples and in similar cases. Alternatively, the ROUND function can be applied in the comparison (as Tom has suggested) or in a preliminary "data cleaning" step, e.g., if the Discount values were imported from another database.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.