Solved: Re: numeric format does not validate intervals properly

ChrisNZ · Posted 07-01-2024 09:00 AM

Consider this code:

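proc format;
  /* Range with endpoints near 8.4E+11 */
  value xx 8.400000000008E+11 -
           8.400000000012E+11 = 'x';
  /* Same digits, scaled down to around 8.4E-11 */
  value yy 8.400000000008E-11 -
           8.400000000012E-11 = 'x';
run;

data _null_;
  A=8.400000000010E+11; put A best20. '/ ' A xx20.;  /* inside the xx range  */
  B=8.400000000000E+11; put B best20. '/ ' B xx20.;  /* outside the xx range */
  C=8.400000000010E-11; put C best20. '/ ' C yy20.;  /* inside the yy range  */
  D=8.400000000000E-11; put D best20. '/ ' D yy20.;  /* outside the yy range */
run;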

A and C are in the format interval, B and D are not.

This generates:

         840000000001/ x                   
         840000000000/         840000000000
    8.40000000001E-11/ x                   
              8.4E-11/ x

D is formatted when it shouldn't be.

Is this a known behaviour?


FreelanceReinh · Posted 07-01-2024 11:44 AM

Hi @ChrisNZ,

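It's the default value 1E-12 of the FUZZ= option which causes this: a value that comes within the fuzz factor of a range endpoint is treated as a match. The numbers 8.400000000008E-11 and 8.400000000000E-11 differ by only 8E-23 < 1E-12, so D is matched, whereas 8.400000000008E+11 and 8.400000000000E+11 differ by 0.8 > 1E-12, so B is not. Use FUZZ=0 (or, e.g., FUZZ=1E-23 in this example) to avoid the unwanted formatting.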
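As a minimal sketch of that fix (reusing the format name yy and the values from the original post), the FUZZ= option goes in parentheses after the format name in the VALUE statement. With FUZZ=0, D should no longer be matched and should print as a plain number, while C still maps to 'x'. I have not listed the output here, so treat this as something to try rather than a verified log.

```sas
proc format;
  /* FUZZ=0 switches off the default 1E-12 tolerance at the range endpoints */
  value yy (fuzz=0) 8.400000000008E-11 -
                    8.400000000012E-11 = 'x';
run;

data _null_;
  C=8.400000000010E-11; put C best20. '/ ' C yy20.;  /* inside the range        */
  D=8.400000000000E-11; put D best20. '/ ' D yy20.;  /* 8E-23 below lower bound */
run;
```

FUZZ=1E-23 would behave the same way for these particular values, since D is 8E-23 away from the lower endpoint.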

@PaigeMiller wrote:

SAS cannot represent numbers exactly that are more than about 15 significant digits.

This is true, but the numbers involved here have only up to 13 significant digits, so SAS can handle them fairly well. Only fairly well, though, because the case of 8.400000000012E+11 is an example where the internal representation depends on whether scientific notation is used in the literal:

data _null_;
x=840000000001.2;        /* plain decimal literal              */
y=8.400000000012E+11;    /* same value, scientific notation    */
if x ne y then put 'Surprise!';
put (x y) (=binary64./); /* show the 64-bit internal patterns  */
run;

Surprise!
x=0100001001101000011100100111110011011010000000000010011001100110
y=0100001001101000011100100111110011011010000000000010011001100111

(using Windows SAS 9.4M5).

Here, the internal representation of x is mathematically correct, i.e., closer to the theoretical exact representation, which repeats the 4-digit pattern "0011" (occurring three times in the representation of x, followed by the last zero) infinitely often. The last bit (1) of the internal representation of y is actually the result of incorrectly rounding up.

Translated back to the decimal system, the two internal representations look like this:

x=840000000001.199951171875 
y=840000000001.2000732421875

So we can see that the precision is clearly sufficient to distinguish either of these numbers from, say, 840000000001.0. The situation with the numbers close to 8.4E-11 is quite similar because they have the same number of significant digits. The eleven leading zeros in the decimal representation don't really matter (except marginally if you try to enter them in a numeric literal, where, again, the internal representation may differ from that of the literal in scientific notation).
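To put rough numbers on "clearly sufficient", here is a small sketch that estimates the spacing between adjacent representable values at both magnitudes. It assumes IEEE 754 double precision with a 52-bit fraction (as on Windows and Unix SAS), and the variable names v, ulp, gap and steps are only illustrative.

```sas
data _null_;
  /* Spacing between adjacent doubles near v: 2**(binary exponent - 52) */
  do v = 8.4E+11, 8.4E-11;
    ulp = 2**(floor(log2(v)) - 52);
    put v= e12. ulp= e12.;
  end;

  /* Distance from D to the lower bound of the yy range, in units of that spacing */
  gap   = 8.400000000008E-11 - 8.4E-11;
  steps = gap / ulp;   /* ulp still holds the value computed for 8.4E-11 */
  put gap= e12. steps= best12.;
run;
```

The spacing should come out at roughly 1.2E-4 near 8.4E+11 (matching the difference between x and y above) and roughly 1.3E-26 near 8.4E-11. The 8E-23 gap between D and the lower bound is therefore thousands of representable steps wide, so the two values are easily distinguished internally; it is only the 1E-12 fuzz factor that makes them match.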


PaigeMiller · Posted 07-01-2024 09:58 AM

SAS cannot represent numbers exactly that are more than about 15 significant digits. See:

Machine Precision

Machine Precision part 2

--
Paige Miller

ChrisNZ · Posted 07-01-2024 07:51 PM

@PaigeMiller wrote: "SAS cannot represent numbers exactly that are more than about 15 significant digits."

I am within that precision range here.