BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ikoyuncu
Fluorite | Level 6

Neither of the following codes filters the rows properly. The discount column was initially formatted as a percentage, I changed it to Comma32.28 to see all decimals but there is nothing but zeros in the decimals. So, if I write WHERE Discount > 0.3 it shouldn't return columns where the discount value is 0.3, am I wrong?

ikoyuncu_0-1631235552849.png

What is wrong with this program? Why it keeps including equality when I specifically say > or GT. How is 0.300000000000000000000 greater than 0.3? Am I missing something?

Has anyone else encountered a similar problem? Is it because of lack of ORDER BY or something? Couldn't think of anything else...

 

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hello @ikoyuncu,

 

It's true that SAS cannot store 0.3 exactly in its 64-bit binary floating-point representation -- indeed, in the binary system 0.3 is a periodic fraction: 0.0100110011001..., so the internal representation is rounded and under Windows it really corresponds to 0.299999999999999988897769753748434595763683319091796875 (= 0.3 - 2**-54 / 5 = 0.3 - 1.1102...E-17) -- but I wouldn't say that the numeric representation error (-1.11...E-17) of this particular number is the main reason for the rounding error in your variable Discount (whose value in the cases in question must differ from 0.3 by more than the unavoidable numeric representation error).

 

Virtually all numbers, including (what should be) integers, can be "contaminated" with similar rounding errors.

 

Examples (log from SAS 9.4M5 under Windows):

409  data _null_;
410  if 1.4-0.4 < 1 < 2.2-1.2
411   & 1.005*1000 < 1005 < 10.05*100
412   & 0.6/0.2 < 3 < 9.9/3.3
413  then put 'Surprised?';
414  run;

Surprised?
NOTE: DATA statement used (Total process time):

Calculations with numbers (like 0.3 and many others) affected by numeric representation error can easily produce rounding errors making the results different from what they should be mathematically, as shown in the examples above and below.

 

So the question is rather how those Discount values only looking like 0.3 or 0.3000... (in standard formats) were created. I suspect that they were calculated from amounts of money with their typical (up to) two decimal places. (Note that, in a sense, 96 percent of the numbers with up to 2 decimal places are affected by numeric representation error.)

 

Examples:

data test;
input x y;
Discount=x/y;
length c $12;
h=put(Discount, hex16.);
diff=Discount-0.3;
     if Discount<0.3 then c='Discount<0.3';
else if Discount=0.3 then c='Discount=0.3';
else if Discount>0.3 then c='Discount>0.3';
cards;
   2.7     9
 240.03  800.1
2100.99 7003.3
;

Result:

Obs          x         y    Discount         c                 h                diff

 1        2.70       9.0       0.3      Discount>0.3    3FD3333333333334    5.55112E-17
 2      240.03     800.1       0.3      Discount=0.3    3FD3333333333333              0
 3     2100.99    7003.3       0.3      Discount<0.3    3FD3333333333332    -5.5511E-17

 

Mathematically, all three Discount values should be 0.3, but only the result in obs. 2 happens to match the internal representation of 0.3. The other two results contain rounding errors in the least significant bit (place value: 2**-54 = 5.551...E-17) which make them different from 0.3 in the IF conditions. This is obvious in formats such as HEX16. revealing the internal representation, but often not in common formats like the standard w.d format or the COMMAw.d format because they are rounding. For instance, format 32.30 reveals a rounding error in obs. 3, but not in obs. 1, unless you look at the difference diff.

 

A definition of Discount using the ROUND function such as

Discount=round(x/y, 1e-9);

would have avoided the issues in the examples and in similar cases. Alternatively, the ROUND function can be applied in the comparison (as Tom has suggested) or in a preliminary "data cleaning" step, e.g., if the Discount values were imported from another database.

View solution in original post

6 REPLIES 6
Tom
Super User Tom
Super User

SAS  (and pretty much all computers) store numbers using BINARY representation. There is no way to represent 3/10ths exactly using binary numbers.  The value that your query is saying is larger then 0.3 is larger than 0.3 by such a small amount that it does not display when printed as a decimal fraction.

 

Round the value and then test.

How many digits of precision do those numbers really have?  Let's say it is to the thousandths place.   So round to the thousandths place or the ten-thousands place and then compare.

WHERE round(Discount,0.0001) > 0.3
ikoyuncu
Fluorite | Level 6
Thanks, this definitely works. I didn't create the Discount column. There has to be a formula involved when it is first created, the values look like 0.3 but they're not precise.
ChrisNZ
Tourmaline | Level 20

@Tom answered.

0.3 cannot be represented exactly as a power of 2, just like 1/3 cannot be represented in the decimal base. Look up numerical precision if you want to know more.

 

> How is 0.300000000000000000000 greater than 0.3?

You can add as many zeros as you want in the code, it makes no difference as the computer cannot use them when storing the decimal number in 8 bytes.

 

 

.

 

ikoyuncu
Fluorite | Level 6
Thanks, I actually didn't add all those zeros to use. I only formatted the column to see if there are any decimals >0 that I am missing. As others suggested, a preliminary 'data cleaning' solved my problem.
ChrisHemedinger
Community Manager

For more about the nuances of floating point math and the apparent inequality of certain numbers that "look" the same, see Numerical precision in SAS. But as others said, this isn't unique to SAS -- it's an IEEE standard for how floating point numbers are represented in systems/programming languages in general.

Check out SAS Innovate on-demand content! Watch the main stage sessions, keynotes, and over 20 technical breakout sessions!
FreelanceReinh
Jade | Level 19

Hello @ikoyuncu,

 

It's true that SAS cannot store 0.3 exactly in its 64-bit binary floating-point representation -- indeed, in the binary system 0.3 is a periodic fraction: 0.0100110011001..., so the internal representation is rounded and under Windows it really corresponds to 0.299999999999999988897769753748434595763683319091796875 (= 0.3 - 2**-54 / 5 = 0.3 - 1.1102...E-17) -- but I wouldn't say that the numeric representation error (-1.11...E-17) of this particular number is the main reason for the rounding error in your variable Discount (whose value in the cases in question must differ from 0.3 by more than the unavoidable numeric representation error).

 

Virtually all numbers, including (what should be) integers, can be "contaminated" with similar rounding errors.

 

Examples (log from SAS 9.4M5 under Windows):

409  data _null_;
410  if 1.4-0.4 < 1 < 2.2-1.2
411   & 1.005*1000 < 1005 < 10.05*100
412   & 0.6/0.2 < 3 < 9.9/3.3
413  then put 'Surprised?';
414  run;

Surprised?
NOTE: DATA statement used (Total process time):

Calculations with numbers (like 0.3 and many others) affected by numeric representation error can easily produce rounding errors making the results different from what they should be mathematically, as shown in the examples above and below.

 

So the question is rather how those Discount values only looking like 0.3 or 0.3000... (in standard formats) were created. I suspect that they were calculated from amounts of money with their typical (up to) two decimal places. (Note that, in a sense, 96 percent of the numbers with up to 2 decimal places are affected by numeric representation error.)

 

Examples:

data test;
input x y;
Discount=x/y;
length c $12;
h=put(Discount, hex16.);
diff=Discount-0.3;
     if Discount<0.3 then c='Discount<0.3';
else if Discount=0.3 then c='Discount=0.3';
else if Discount>0.3 then c='Discount>0.3';
cards;
   2.7     9
 240.03  800.1
2100.99 7003.3
;

Result:

Obs          x         y    Discount         c                 h                diff

 1        2.70       9.0       0.3      Discount>0.3    3FD3333333333334    5.55112E-17
 2      240.03     800.1       0.3      Discount=0.3    3FD3333333333333              0
 3     2100.99    7003.3       0.3      Discount<0.3    3FD3333333333332    -5.5511E-17

 

Mathematically, all three Discount values should be 0.3, but only the result in obs. 2 happens to match the internal representation of 0.3. The other two results contain rounding errors in the least significant bit (place value: 2**-54 = 5.551...E-17) which make them different from 0.3 in the IF conditions. This is obvious in formats such as HEX16. revealing the internal representation, but often not in common formats like the standard w.d format or the COMMAw.d format because they are rounding. For instance, format 32.30 reveals a rounding error in obs. 3, but not in obs. 1, unless you look at the difference diff.

 

A definition of Discount using the ROUND function such as

Discount=round(x/y, 1e-9);

would have avoided the issues in the examples and in similar cases. Alternatively, the ROUND function can be applied in the comparison (as Tom has suggested) or in a preliminary "data cleaning" step, e.g., if the Discount values were imported from another database.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 3261 views
  • 6 likes
  • 5 in conversation