Re: Expected numeric precision behaviour or unexpected issue? - Page 2

ChrisNZ · Posted 07-31-2016 09:21 PM

After many time-consuming message exchanges, I am told by tech support that reading

X=0.5000000000000000000000000 (that's L=23 in our discussion)

as something other than 0.5 is expected behaviour.

If someone else wants to try and report this...

Sigh...

High-Performance SAS Coding - Third Edition

Nancy05 · Posted 07-31-2016 11:12 PM

I tried your code, and got this result:

V NB
0 .

I am running on SAS EG 7.12 HF3 (7.100.34000) (64-bit) on Windows 8.0.

ChrisNZ · Posted 08-01-2016 07:33 PM

Will no one try to push this?

If this is expected behaviour, the expectations are very low, and I hope I'll be retired before they're met... 🙂

Tom · Posted 08-01-2016 08:39 PM

Did you open a ticket with SAS support? What did they say?

ChrisNZ · Posted 08-01-2016 09:55 PM

I was told by tech support that reading

X=0.5000000000000000000000000 (that's L=23 in our discussion)

as something other than 0.5 is expected behaviour.

ChrisNZ · Posted 08-01-2016 10:43 PM

Just to prove my point (not to criticise anyone; this is not a blame game), since I reckon it is hard to believe, here is my last email, sent after a long series of messages detailing the issue and getting nowhere:

1) If I run this under Windows:

data T;
  NB=0.5000000000000000000000000;
run;

SAS will not store the correct number.

2) If I run the same code on a different platform, or if I add or remove zeros, the correct number is stored.

How 1) is acceptable is beyond me. How 1) and 2) are acceptable even more so.

If you somehow think this is expected behaviour, please tell me and close the track.

and the nonsensical reply:

Speaking to the developer, this is expected behaviour due to the algorithms being used to store numbers that are using more bytes than can be accurately stored.

ChrisNZ · Posted 08-02-2016 05:42 PM

So no one to pick up the ball?

I'll open a suggestion entry if no one is interested. I can't let this die off like this.

mike_jones_SAS · Posted 08-05-2016 11:38 AM

I support this routine on SAS for Windows (both 32-bit and x64 Windows).

This issue is not OS related. It pertains to floating point representation on x64 processors. IEEE double-precision floating point can represent only about 15 significant decimal digits reliably; sometimes you can see 16. In the initial example, those 15 digits are easily exceeded, even though there are cases where the result is the exact representation of 0.5.
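The digit limits mentioned above can be inspected outside SAS. The following is a minimal Python sketch, assuming Python's `float` is an IEEE 754 double (true on common platforms); Python is not used anywhere in this thread, and the sketch says nothing about SAS's own conversion routine.

```python
import sys

# Properties of the platform double (IEEE 754 binary64 on common hardware).
mantissa_bits = sys.float_info.mant_dig  # significand precision in bits
safe_digits = sys.float_info.dig         # decimal digits that always survive a round trip

# Formatting with 17 significant digits is enough to reproduce a double exactly.
x = 0.1
round_trip = float(format(x, ".17g")) == x

print(mantissa_bits, safe_digits, round_trip)
```

The 15 reported by `sys.float_info.dig` is the number of decimal digits guaranteed to survive text-to-double-to-text conversion; 17 digits are needed in the other direction.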

The routine used to compute the result is different on Windows than on any other host (Linux, UNIX, AIX, etc.). The routine is written in x64 assembly to maximize performance since almost every alpha-numeric in SAS is processed by this routine. The algorithm used has existed since I wrote the routine back in v6.03 SAS for PC-DOS (16-bit). Only the instruction set has changed.

Here’s an example to show this issue:

data _null_;
  x=0.5000000000000000000000000;
  y=0.5000000000000000000000001;
  put x= hex16. y= hex16.;
run;

x=3FDFFFFFFFFFFFFF y=3FDFFFFFFFFFFFFF

NOTE: DATA statement used (Total process time):
      real time 0.00 seconds
      cpu time 0.00 seconds

In this case, the result is slightly less than 0.5 for x as well as y. However, y should be different. A comparison of these two variables would find them equal, although, as we know, they should not be.
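To see how far the logged value is from 0.5, the hex string can be decoded outside SAS. This is a minimal Python sketch, assuming the `hex16.` output is the big-endian IEEE 754 bit pattern of the double; the helper name `from_hex` is made up for illustration.

```python
import struct

def from_hex(h: str) -> float:
    """Interpret 16 hex digits as a big-endian IEEE 754 double."""
    return struct.unpack(">d", bytes.fromhex(h))[0]

stored = from_hex("3FDFFFFFFFFFFFFF")  # value shown in the log above
half = from_hex("3FE0000000000000")    # bit pattern of exactly 0.5

gap = half - stored                    # distance between the two doubles
print(repr(stored), repr(half), gap == 2.0 ** -54)
```

Under these assumptions the logged value is the largest double below 0.5, that is, one representable step (2^-54) short of it.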

Here’s another example to show this issue:

data _null_;
  x=0.500000000000000000000000;
  y=0.500000000000000000000001;
  put x= hex16. y= hex16.;
run;

x=3FE0000000000000 y=3FE0000000000000

NOTE: DATA statement used (Total process time):
      real time 0.00 seconds
      cpu time 0.00 seconds

In this case, with fewer trailing zeros, the result is exactly 0.5 for x as well as y. However, y should be different. Again, a comparison of these two variables would find them equal, although, as we know, they should not be.

Now, let’s reduce the trailing zeros to where the least significant digit can be seen in the result.

data _null_;
  x=0.5000000000000000;
  y=0.5000000000000001;
  put x= hex16. y= hex16.;
run;

x=3FE0000000000000 y=3FE0000000000001

NOTE: DATA statement used (Total process time):
      real time 0.00 seconds
      cpu time 0.01 seconds

A comparison of x and y in this case would find them not equal, and rightly so.
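Whether two literals can be told apart depends on the spacing of doubles just above 0.5, which is 2^-53 (about 1.1e-16). Below is a short Python sketch, offered only as a cross-check outside SAS. It assumes Python's `float()` rounds decimal strings to the nearest double, as CPython does on mainstream platforms; `to_hex` is a hypothetical helper.

```python
import struct

def to_hex(x: float) -> str:
    """Return the big-endian IEEE 754 bit pattern of a double as 16 hex digits."""
    return struct.pack(">d", x).hex().upper()

step = 2.0 ** -53  # spacing of doubles in [0.5, 1)

a = float("0.5000000000000001")          # 1e-16 above 0.5: more than half a step
b = float("0.500000000000000000000001")  # 1e-24 above 0.5: far less than half a step

print(to_hex(a), to_hex(b), a == 0.5 + step)
```

A literal more than half a step above 0.5 lands on the next double, while one that is far less than half a step above 0.5 rounds back to 0.5 itself.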

When comparing floating point numbers, the COMPFUZZ function is recommended.

http://support.sas.com/documentation/cdl/en/lefunctionsref/67960/HTML/default/viewer.htm#p0ifledavu3...
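For readers outside SAS, the same idea of comparing with a tolerance can be sketched in Python with `math.isclose`. This is an analogue only: it does not reproduce COMPFUZZ's arguments or defaults, and the tolerance `1e-12` is an arbitrary choice for illustration.

```python
import math
import struct

# The value logged above for x, decoded from its hex16. bit pattern.
stored = struct.unpack(">d", bytes.fromhex("3FDFFFFFFFFFFFFF"))[0]

exact_equal = stored == 0.5                                          # strict comparison
fuzzy_equal = math.isclose(stored, 0.5, rel_tol=1e-12, abs_tol=0.0)  # tolerant comparison

print(exact_equal, fuzzy_equal)
```

A tolerant comparison hides a one-step difference like this one; it does not change what was stored.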

ChrisNZ · Posted 08-07-2016 06:40 AM

Thank you for your reply @mike_jones_SAS, I much appreciate your taking the time to provide your expert input here.

I fail to see how the floating point representation on the x64 CPU, or more generally IEEE floating point representation, has anything to do with this.

SAS on Linux also runs on x64 and apparently doesn't have the issue either, according to @Tom.

POWER processors also follow the IEEE Standard for Floating-Point arithmetic and yet SAS running on this platform does not exhibit this behaviour.

The floating point representation for 0.5 is 3FE0000000000000.

And for some reason in a few specific, reproducible cases, SAS reads this number wrongly. SAS could easily store the correct value if it tried. But SAS chooses to store an incorrect value instead, because that's what it's read.

SAS reads wrongly:

1- On Windows only

2- Depending on random changes in the number of non-significant zeros.

0.50000000000000000000000000 is read correctly by the SAS interpreter

0.5000000000000000000000000 is not read correctly by the SAS interpreter

0.500000000000000000000000 is read correctly by the SAS interpreter

All three values are hex value 3FE0000000000000, easily stored in IEEE floating point representation.

All values are read correctly on all SAS platforms, except on Windows when there are exactly 24 trailing zeros; 23 and 25 are fine.

Why?
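One way to check the claim that all three literals denote exactly 0.5 is to parse them with a converter that rounds to the nearest double. This is a hedged Python sketch: Python is not used in this thread, CPython's `float()` is assumed to be correctly rounded, and `to_hex` is an illustrative helper.

```python
import struct

def to_hex(x: float) -> str:
    """Return the big-endian IEEE 754 bit pattern of a double as 16 hex digits."""
    return struct.pack(">d", x).hex().upper()

# 0.5 followed by 23, 24 and 25 trailing zeros, as in the three cases above.
literals = ["0.5" + "0" * n for n in (23, 24, 25)]
results = {lit: to_hex(float(lit)) for lit in literals}

for lit, bits in results.items():
    print(lit, bits)
```

With a nearest-double conversion the number of trailing zeros makes no difference: each literal maps to the same bit pattern.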

PGStats · Posted 08-07-2016 10:33 PM

Hi @mike_jones_SAS, it's great having you in this discussion.

There is a qualitative difference between reading the same value for numbers that differ by less than a small amount and reading different values for numbers that are the same. Further, I would expect that numbers that round to the same value would be represented by the same value.
Take Pi for example. Whether I expand it to 20 or 30 decimals, it should always be equal to CONSTANT("PI") in a 15-16 decimal representation. But it isn't:

data _null_;
decimals = 10;
do pi =
     3.1415926536
    ,3.14159265359
    ,3.141592653590
    ,3.1415926535898
    ,3.14159265358979
    ,3.141592653589793
    ,3.1415926535897932
    ,3.14159265358979324
    ,3.141592653589793238
    ,3.1415926535897932385
    ,3.14159265358979323846
    ,3.141592653589793238463
    ,3.1415926535897932384626
    ,3.14159265358979323846264
    ,3.141592653589793238462643
    ,3.1415926535897932384626434
    ,3.14159265358979323846264338
    ,3.141592653589793238462643383
    ,3.1415926535897932384626433833
    ,3.14159265358979323846264338328
    ,3.141592653589793238462643383280
    ,3.1415926535897932384626433832795
    ,3.14159265358979323846264338327950
    ,3.141592653589793238462643383279503
    ,3.1415926535897932384626433832795029
    ,3.14159265358979323846264338327950288
    ,3.141592653589793238462643383279502884
    ,3.1415926535897932384626433832795028842
    ,3.14159265358979323846264338327950288420
    ,3.141592653589793238462643383279502884197
    ,3.1415926535897932384626433832795028841972
    ,3.14159265358979323846264338327950288419717
    ,3.141592653589793238462643383279502884197169
    ,3.1415926535897932384626433832795028841971694
    ,3.14159265358979323846264338327950288419716939
    ,3.141592653589793238462643383279502884197169400
    ,3.1415926535897932384626433832795028841971693994
    ,3.14159265358979323846264338327950288419716939938
    ,3.141592653589793238462643383279502884197169399375
    ,3.1415926535897932384626433832795028841971693993751
    ,3.14159265358979323846264338327950288419716939937510;
    eq = PI = constant("PI");
    put decimals= eq= PI= hex16.;
    decimals + 1;
    end;
run;

decimals=10 eq=0 pi=400921FB544486E0
decimals=11 eq=0 pi=400921FB54442EEA
decimals=12 eq=0 pi=400921FB54442EEA
decimals=13 eq=0 pi=400921FB54442D28
decimals=14 eq=0 pi=400921FB54442D11
decimals=15 eq=1 pi=400921FB54442D18
decimals=16 eq=1 pi=400921FB54442D18
decimals=17 eq=1 pi=400921FB54442D18
decimals=18 eq=1 pi=400921FB54442D18
decimals=19 eq=1 pi=400921FB54442D18
decimals=20 eq=1 pi=400921FB54442D18
decimals=21 eq=1 pi=400921FB54442D18
decimals=22 eq=1 pi=400921FB54442D18
decimals=23 eq=0 pi=400921FB54442D19
decimals=24 eq=1 pi=400921FB54442D18
decimals=25 eq=1 pi=400921FB54442D18
decimals=26 eq=1 pi=400921FB54442D18
decimals=27 eq=1 pi=400921FB54442D18
decimals=28 eq=0 pi=400921FB54442D19
decimals=29 eq=0 pi=400921FB54442D19
decimals=30 eq=1 pi=400921FB54442D18
decimals=31 eq=0 pi=400921FB54442D19
decimals=32 eq=0 pi=400921FB54442D19
decimals=33 eq=0 pi=400921FB54442D19
decimals=34 eq=0 pi=400921FB54442D19
decimals=35 eq=0 pi=400921FB54442D19
decimals=36 eq=0 pi=400921FB54442D19
decimals=37 eq=0 pi=400921FB54442D19
decimals=38 eq=0 pi=400921FB54442D19
decimals=39 eq=0 pi=400921FB54442D19
decimals=40 eq=0 pi=400921FB54442D19
decimals=41 eq=0 pi=400921FB54442D19
decimals=42 eq=0 pi=400921FB54442D19
decimals=43 eq=0 pi=400921FB54442D19
decimals=44 eq=0 pi=400921FB54442D19
decimals=45 eq=0 pi=400921FB54442D19
decimals=46 eq=0 pi=400921FB54442D19
decimals=47 eq=0 pi=400921FB54442D19
decimals=48 eq=0 pi=400921FB54442D19
decimals=49 eq=0 pi=400921FB54442D19
decimals=50 eq=0 pi=400921FB54442D19
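For comparison, the same experiment can be run against a decimal-to-double converter that rounds to the nearest value. The Python sketch below is a cross-check under stated assumptions, not part of the SAS log: it truncates a 50-decimal expansion of pi rather than rounding it as the literals above do, and it relies on CPython's `float()` being correctly rounded. The names `PI_DIGITS` and `mismatches` are illustrative.

```python
import math

# pi to 50 decimal places.
PI_DIGITS = "3.14159265358979323846264338327950288419716939937510"

# For each expansion with 15 to 50 decimals, parse the text and compare
# the resulting double with math.pi (the double nearest to pi).
mismatches = [
    n for n in range(15, 51)
    if float(PI_DIGITS[: 2 + n]) != math.pi
]

print(mismatches)
```

An empty list means every expansion from 15 decimals upward parses to the same double, which is the behaviour the post above argues for.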

PG

Kurt_Bremser · Posted 08-08-2016 02:21 AM

@mike_jones_SAS Reading slightly different input values to the same stored value because of numeric precision is perfectly understandable.

Reading THE SAME VALUE differently just because of the number of insignificant zeros that follow it, and on only one platform (Windows), which uses the same hardware as another platform that works (Linux x64) and a real-number representation common to almost all 64-bit processors today (which also do not display the same error), points to a fault in the Windows code of SAS.

If you find where it happens, and find it is too hard to fix because you would have to basically do MS's work by fixing Windows itself, put that into a message in the SAS knowledge base, so everybody knows it may happen and it is as it is.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

mike_jones_SAS · Posted 08-10-2016 02:44 PM

There is an easy workaround for this issue. Any change to the routine for Windows x64 could not only lead to poorer performance but could also introduce unintended side effects or errors in other computations.

SAS Technical Support will pursue a SASnote (or SAS Knowledge Base Message) to document this behavior.

ChrisNZ · Posted 08-10-2016 10:26 PM

Glad this is finally recognized as an issue!

However, it took several weeks of discussing the problem here (2 threads, 4 pages) and with tech support, receiving shallow replies, spending hours to demonstrate and argue, before something is finally done. And that's just for a Usage Note, not a fix, so the issue remains.

Once again, SAS should be eagerly seeking such feedback. It should not be so hard and time-consuming. Similarly, there are other issues that were never accepted by tech support for no valid reason and will remain as software quirks/defects/issues.

@mike_jones_SAS Please point us to the UN when it is created, I am curious to know what the contents will be. Thanks again for your involvement.

Kurt_Bremser · Posted 08-11-2016 01:56 AM

It is sometimes funny to watch how big software companies are glued to the status quo, rejecting easily made changes that would get rid of major problems. My pet peeve is the faulty date design that MS copied from Lotus 1-2-3 into Excel and still keeps, although the open source office suites have already shown the correct way to deal with the problem. (For those who don't know it yet: try to enter 29-02-1900 in Excel and in OpenOffice Calc)

That the fault actually seems to lead into the depths of Windows itself once again confirms my stance that Windows is the worst possible production platform available today.

ChrisNZ · Posted 09-15-2016 07:01 PM

Has the SAS Note been created?
