BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Zaghini
Fluorite | Level 6

Hi, I have a problem with scientific notation.

My dataset prova has 9 rows and it is so composed:

Vincolo

bil22

bil23

bil24

BIL031 - BIL031

80147,35

80147,35

80147,35

BIL031 - BIL031

0

0

0

BIL031 - BIL031

0

0

0

BIL031 - BIL031

0

0

0

BIL031 - BIL031

0

0

0

BIL031 - BIL031

-23793,09

-23793,09

-23793,09

BIL031 - BIL031

-9750,34

-9750,34

-9750,34

BIL031 - BIL031

-22784,59

-22784,59

-22784,59

BIL031 - BIL031

-23819,33

-23819,33

-23819,33

 

Variables bil22, bil23 and bil24 are created with:

data prova;

set vincolati;

bil22=round(bilancio2022,.01);

bil23=round(bilancio2023,.01);

bil24=round(bilancio2024,.01);

run;

 

when I summarize  

proc sql;
CREATE TABLE vinc_tot AS
SELECT
vincolo, sum(Bil22) as vincsum22, sum(Bil23) as vincsum23, sum(Bil24) as vincsum24
FROM prova

GROUP BY vincolo; 
QUIT;

 

I expect a risult like this:

vincolo

vincsum22

vincsum23

vincsum24

BIL031 - BIL031

0

0

0

 

But I obtain this:

vincolo

vincsum22

vincsum23

vincsum24

BIL031 - BIL031

3.637979E-12

3.637979E-12

3.637979E-12

 

As if in Bil22, Bil23 and Bil24 (rounded .01 !!! ) there where several decimals.

Someone can help me?

Thank you

Massimo

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hi @Zaghini and welcome to the SAS Support Communities!

 

I see that @Rick_SAS has already provided the explanation and solution. Let me just add a few details specific to your example.

 

You have encountered a numeric representation issue. Computers store numeric values in the binary system. Most of the numbers with one or more decimals (in the familiar decimal system) have infinitely (!) many digits when converted to the binary system. Similar to 1/3=0.333333... in the decimal system they are periodic fractions. SAS uses a 64-bit floating-point representation, which is rounded if the exact number would require more than 64 bits, let alone infinitely many, in that representation. Hence, most numbers with one or more decimals and in fact all numbers except 0 in your sample data contain a small rounding error that you don't see with standard SAS formats.

 

Example:

Your first value, 80147.35, written in the binary system, looks like this:

10011100100010011.01011001100110...

The pattern "0110" (highlighted in boldface) is repeated periodically.

 

Compare this to the internal 64-bit floating-point representation used by SAS (here: under Windows or Unix; see Numerical Accuracy in SAS Software for the documentation).

0100000011110011100100010011010110011001100110011001100110011010

 

You can recognize the blue digits of the integer part (the first "1" is the so-called "implied bit") and the red digits of the fractional part with eight copies of the repeating 4-digit pattern. (The first 12 bits of the internal representation store the sign and the order of magnitude of the number.) The last two digits, 10 (in purple italics), are the result of rounding 01100110... to only two digits. So the number is rounded up and thus a little greater than the exact value 80147.35. Indeed, converting the internal representation back to the decimal system would result in

80147.35000000000582076609134674072265625

which means an unavoidable rounding error of 1.6*2**-38 ≈ 5.82E-12.

 

Now, if you add several numbers with rounding errors (or do other calculations with them), it's likely that the result will be contaminated with a rounding error as well. In your example, the mathematically exact result happens to be 0, so that the rounding error, although as small as 2**-38 ≈ 3.637979E-12, becomes apparent.

 

Rounding the sum with a rounding unit small enough not to influence the results, but greater than the tiny rounding error described above, gives the desired result:

round(sum(Bil22),1e-9) as vincsum22

In your example, knowing that the result can have at most two decimals like all numbers involved, you could also use 0.01=1e-2 as the rounding unit to obtain the same result.

 

Edit:

The other numbers involved in your sum, converted back and forth as above, are:

-23793.09000000000014551915228366851806640625 = -23793.09 - 1.28*2**-43
- 9750.34000000000014551915228366851806640625 = - 9750.34 - 1.28*2**-43
-22784.59000000000014551915228366851806640625 = -22784.59 - 1.28*2**-43
-23819.330000000001746229827404022216796875   = -23819.33 - 1.92*2**-40

In this particular example the rounding error you obtained for the sum equals the sum of the rounding errors of the summands: 1.6*2**-38 - 3*1.28*2**-43 - 1.92*2**-40 = 2**-38. But this is not always the case. The exact rounding error depends on the addition of the internal representations and on the order of the summands, i.e., a+b+c in SAS (and other software) is not necessarily equal to b+c+a, which is particularly problematic in PROC SQL where "order of observations" is not stable in general. Indeed, I get six different sums ranging from -3.63798E-12 over 0 to 1.455192E-11 when I permute the five summands in all possible ways.

View solution in original post

4 REPLIES 4
Rick_SAS
SAS Super FREQ

Yes, it can be very frustrating when you realize that every decimal number cannot be represented exactly as a  finite-precision binary number.

 

In base-10 (decimal), certain numbers such as 1/3 cannot be represented exactly by any finite decimal number. So an expression such as 1/3 + 1/3 + 1/3 becomes (in decimal) 0.33333 + 0.33333 + 0.33333 = 0.99999, when we know that in EXACT arithmetic the answer should be 1.0.

 

The same thing is happening here. Run this DATA step to see:

data Test;
s = sum(1.35, -0.09, -0.34, -0.59, -0.33);
run;

proc print data=Test;
run;

Even though in EXACT arithmetic these numbers sum to 0.0, when you represent the numbers in finite-precision binary, the sum differs from zero by a small amount.

 

One way to handle this issue is to round the FINAL SUM, not the individual elements.

 

By the way, in numerical analysis, this issue is sometimes called "10.0 times 0.1 is hardly ever 1.0"  (Kernighan and Plauger, 1974, The Elements of Programming Style). For another example and a discussion of defensive programming related to finite precision, see "Programming tip: Avoid testing floating-point values for equality".

FreelanceReinh
Jade | Level 19

Hi @Zaghini and welcome to the SAS Support Communities!

 

I see that @Rick_SAS has already provided the explanation and solution. Let me just add a few details specific to your example.

 

You have encountered a numeric representation issue. Computers store numeric values in the binary system. Most of the numbers with one or more decimals (in the familiar decimal system) have infinitely (!) many digits when converted to the binary system. Similar to 1/3=0.333333... in the decimal system they are periodic fractions. SAS uses a 64-bit floating-point representation, which is rounded if the exact number would require more than 64 bits, let alone infinitely many, in that representation. Hence, most numbers with one or more decimals and in fact all numbers except 0 in your sample data contain a small rounding error that you don't see with standard SAS formats.

 

Example:

Your first value, 80147.35, written in the binary system, looks like this:

10011100100010011.01011001100110...

The pattern "0110" (highlighted in boldface) is repeated periodically.

 

Compare this to the internal 64-bit floating-point representation used by SAS (here: under Windows or Unix; see Numerical Accuracy in SAS Software for the documentation).

0100000011110011100100010011010110011001100110011001100110011010

 

You can recognize the blue digits of the integer part (the first "1" is the so-called "implied bit") and the red digits of the fractional part with eight copies of the repeating 4-digit pattern. (The first 12 bits of the internal representation store the sign and the order of magnitude of the number.) The last two digits, 10 (in purple italics), are the result of rounding 01100110... to only two digits. So the number is rounded up and thus a little greater than the exact value 80147.35. Indeed, converting the internal representation back to the decimal system would result in

80147.35000000000582076609134674072265625

which means an unavoidable rounding error of 1.6*2**-38 ≈ 5.82E-12.

 

Now, if you add several numbers with rounding errors (or do other calculations with them), it's likely that the result will be contaminated with a rounding error as well. In your example, the mathematically exact result happens to be 0, so that the rounding error, although as small as 2**-38 ≈ 3.637979E-12, becomes apparent.

 

Rounding the sum with a rounding unit small enough not to influence the results, but greater than the tiny rounding error described above, gives the desired result:

round(sum(Bil22),1e-9) as vincsum22

In your example, knowing that the result can have at most two decimals like all numbers involved, you could also use 0.01=1e-2 as the rounding unit to obtain the same result.

 

Edit:

The other numbers involved in your sum, converted back and forth as above, are:

-23793.09000000000014551915228366851806640625 = -23793.09 - 1.28*2**-43
- 9750.34000000000014551915228366851806640625 = - 9750.34 - 1.28*2**-43
-22784.59000000000014551915228366851806640625 = -22784.59 - 1.28*2**-43
-23819.330000000001746229827404022216796875   = -23819.33 - 1.92*2**-40

In this particular example the rounding error you obtained for the sum equals the sum of the rounding errors of the summands: 1.6*2**-38 - 3*1.28*2**-43 - 1.92*2**-40 = 2**-38. But this is not always the case. The exact rounding error depends on the addition of the internal representations and on the order of the summands, i.e., a+b+c in SAS (and other software) is not necessarily equal to b+c+a, which is particularly problematic in PROC SQL where "order of observations" is not stable in general. Indeed, I get six different sums ranging from -3.63798E-12 over 0 to 1.455192E-11 when I permute the five summands in all possible ways.

Zaghini
Fluorite | Level 6

thank you all @Rick_SAS @Kurt_Bremser @FreelanceReinhard, it is my first time in sas communities. You did a great job! I followed your instructions and it works perfectly.

m.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 1999 views
  • 5 likes
  • 4 in conversation