mkeintz
PROC Star

While computer science folks learn this early in their studies, this note describes what for some will be a non-intuitive consequence of finite numeric precision in digital computing.  It is NOT a SAS issue – it’s a digital computing issue.

 

The largest integer up to which every integer is exactly represented in 8-byte floating-point storage (see "Largest Integer Represented Exactly?" - NOT Exactly), as used by SAS on Windows and many other machines, is 9,007,199,254,740,992 (that is, 2**53). It can be generated by the CONSTANT function:

 

     A=constant('exactint',8);      /* largest consecutive integer for an 8-byte real number */

 

But as the link notes, there are lots of non-consecutive integers greater than A which are also accurately stored.  For instance, every even number between A and 2*A, every multiple of 4 between 2*A and 4*A, etc. 

Edited note: In fact, every representable value above A is an integer, and those integers are increasingly widely spaced.  (Every representable value between 0.5*A and A is also an integer, with no integers skipped.)

 

This means that the expression A+1 will not result in 9,007,199,254,740,993, because that number can’t be represented.  It lies exactly halfway between the two nearest representable values, 9,007,199,254,740,992 and 9,007,199,254,740,994, and the usual round-to-nearest-even rule resolves the tie to 9,007,199,254,740,992 – the original A.  And so will 1+A.   At least in this case A+B=B+A.  Both of them have the same “rounding error”.

 

So what about adding 2 to A?   Both A+2 and 2+A generate 9,007,199,254,740,994 – as should be intuitively expected – no rounding error.  And of course A+B=B+A.
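
A small sketch of the spacing just described, assuming the usual IEEE double-precision storage with round-to-nearest-even that SAS uses on Windows. The variable names k and v are only for illustration. Each sum is stored in v before it is printed, so it is rounded to an 8-byte value:

```sas
data _null_;
  A = constant('exactint');        /* 9,007,199,254,740,992 = 2**53 */
  do k = 0 to 4;
    v = A + k;                     /* result is rounded when stored in v */
    put k= v= comma22.0;
  end;
run;
```

Under those assumptions the odd targets A+1 and A+3 cannot be stored, so k=1 should print the same value as k=0, and k=3 the same value as k=4, while k=2 and k=4 are exact.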

 

But consider A+B+C versus C+B+A.

 

  • A+1+1  generates 9,007,199,254,740,992,
    while
  • 1+1+A  generates 9,007,199,254,740,994

This is because A+1+1 is processed as

  • A+1+1 --> (A+1)+1 --> A+1 --> A = 9,007,199,254,740,992
    while
  • 1+1+A --> (1+1)+A --> 2+A = 9,007,199,254,740,994

 

Code to illustrate this follows:

 

 

data _null_;
  A=constant('exactint'); B=1; C=1; 
  put (A B C) (= +3 );

  sum_AB=sum(A,B);
  sum_BA=sum(B,A);
  put / '*** SUM(A,B) equals SUM(B,A) ***  ' / sum_AB=  / sum_BA=;

  sum_ABC=sum(A,B,C);
  sum_CBA=sum(C,B,A);

  put / '*** But SUM(A,B,C) need NOT equal SUM(C,B,A) *** ' 
     / sum_ABC= / sum_CBA=;

  format A: sum_: comma22.0;

run;

 

This is a trivial example of why summary statistics generated from a given data set can change when that data set is sorted.  The most "accurate" way to get, say, a sum would be to sort the data in ascending absolute value order, so that the small values accumulate before they are added to the large ones.  But you'd have to have some very pathological data (like mixing values such as 1 and 9,007,199,254,740,992) to get meaningful differences.
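
As a rough illustration of that sort-order effect, here is a minimal sketch using the same pathological values. The data set names vals, vals_asc and vals_desc and the variable total are made up for this example, and a plain sum statement in a DATA step stands in for whatever a summary procedure does internally. All values are positive, so sorting by x is the same as sorting by absolute value:

```sas
data vals;                          /* hypothetical test data: A, 1, 1 */
  x = constant('exactint'); output;
  x = 1; output;
  x = 1; output;
run;

proc sort data=vals out=vals_asc;   /* small values first */
  by x;
run;

proc sort data=vals out=vals_desc;  /* large value first */
  by descending x;
run;

data _null_;
  set vals_asc end=last;
  total + x;                        /* running sum: 1, 2, then 2+A */
  if last then put 'Ascending:  ' total= comma22.0;
run;

data _null_;
  set vals_desc end=last;
  total + x;                        /* running sum: A, then A+1 rounds back to A twice */
  if last then put 'Descending: ' total= comma22.0;
run;
```

With IEEE doubles the ascending pass should end at 9,007,199,254,740,994 and the descending pass at 9,007,199,254,740,992, mirroring 1+1+A versus A+1+1 above.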

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
ChrisNZ
Tourmaline | Level 20

> But you'd have to have some very pathological data (like mixing values such as 1 and 9,007,199,254,740,992) to get meaningful differences.

Do you have to have some pathological OCD issue to think of this post? 😉  Interesting, and obvious when well explained as you did. 😁

mkeintz
PROC Star

@ChrisNZ wrote:

> > But you'd have to have some very pathological data (like mixing values such as 1 and 9,007,199,254,740,992) to get meaningful differences.
>
> Do you have to have some pathological OCD issue to think of this post? 😉


Actually I've always found plain vanilla OCD to be sufficient.

ChrisNZ
Tourmaline | Level 20

> Actually I've always found plain vanilla OCD to be sufficient.

Fair enough. And agree. 🙂

FreelanceReinh
Jade | Level 19

Thanks, @mkeintz. This is an instructive example where it's easy to see how the differences come about.

 

Here's another example (using SAS 9.4 under Windows) illustrating your point with "real-world" data:

/* BMI rounded to one decimal place */
data test;
  set sashelp.class;
  bmi=round(703*weight/height**2, .1);
run;

/* One column per student: BMI_Alfred, BMI_Alice, ... */
proc transpose data=test out=bmi prefix=BMI_;
  var bmi;
  id name;
run;

/* Add the same three values in all six possible orders */
data sums;
  set bmi(keep=BMI_Alfred BMI_Carol BMI_Janet);
  array b[3] BMI:;
  sum_ACJ=b[1]+b[2]+b[3];
  sum_CAJ=b[2]+b[1]+b[3];
  sum_AJC=b[1]+b[3]+b[2];
  sum_JAC=b[3]+b[1]+b[2];
  sum_CJA=b[2]+b[3]+b[1];
  sum_JCA=b[3]+b[2]+b[1];
  format s: hex16.;   /* show the stored bit patterns */
run;

Result:

BMI_Alfred  BMI_Carol  BMI_Janet
      16.6       18.3       20.2

sum_ACJ  404B8CCCCCCCCCCE
sum_CAJ  404B8CCCCCCCCCCE
sum_AJC  404B8CCCCCCCCCCC
sum_JAC  404B8CCCCCCCCCCC
sum_CJA  404B8CCCCCCCCCCD
sum_JCA  404B8CCCCCCCCCCD

As soon as non-integer values are involved (even with only a single decimal place, except .5), there's a substantial risk of rounding errors in calculations.

 

Since "A+B always equals B+A," we can expect sum_ACJ=sum_CAJ, sum_AJC=sum_JAC and sum_CJA=sum_JCA, but there's no guarantee for more equalities. Indeed, three different sums occur in the example above, which is just one of dozens of similar cases that occur within the BMI values of SASHELP.CLASS.
