Hello!
I want to create the timestamp "31DEC9999:23:59:59.999999" with this code:
data _NULL_;
   without_decimals = dhms('31DEC9999'D,23,59,59);
   with_decimals = sum(dhms('31DEC9999'D,23,59,59), 0.999999);
   put without_decimals= datetime25.6;
   put with_decimals= datetime25.6;
run;
However, in the Log I get this:
without_decimals=31DEC9999:23:59:59.000000
with_decimals=01JAN****:00:00:00.000000
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
Why do I not get the expected "31DEC9999:23:59:59.999999" ?
Hello @FK1,
In short: The reason is a rounding error. Assuming that your operating system is Windows or Unix, here is what happens in detail:
The integer
without_decimals=253717747199
has a 64-bit binary floating-point representation (as can be seen with the BINARY64. format) of
0100001001001101100010010110000011001010111111111000000000000000
This corresponds to writing the integer as 1.110110001... * 2**37 in the binary system.
Similarly, the decimal fraction 0.999999 is internally represented as
0011111111101111111111111111110111100111001000010000101111101001
which corresponds to 1.111111111... * 2**-1.
For the addition of these two numbers, the smaller number, 0.999999, must use the exponent (37) of the larger number. Therefore, its mantissa must be shifted to the right by 37−(−1)=38 digits (and the "implied bit" appears explicitly):
010000100100000000000000000000000000000000000000011111111111111111110111100111001000010000101111101001
The last bits of this shifted representation already fall outside the processor's 80-bit precision, but the relevant rounding error occurs when the sum is rounded to 52 mantissa bits:
0100001001001101100010010110000011001010111111111000000000000000   (253717747199)
010000100100000000000000000000000000000000000000011111111111111111...   (0.999999, rewritten with exponent 37)
010000100100110110001001011000001100101011111111111111111111111111...   (exact sum, before rounding)
Obviously, the 52nd mantissa bit must be rounded up because of the string of 1s that follows. The resulting internal representation
0100001001001101100010010110000011001011000000000000000000000000
is that of the integer 253717747200, i.e., the datetime value corresponding to 01 Jan 10000 00:00:00.
So, in other words, due to the magnitude of datetime values in years like 9999, the precision of the 64-bit internal representation is insufficient to reflect tiny differences like a millionth of a second. In fact, the least significant bit of such datetime values has a place value of 2**(37−52) = 2**−15 = 3.05...E−5, i.e., about 30 millionths of a second. For practical applications this should be sufficient.
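For anyone who wants to reproduce this, here is a minimal sketch that prints the bit patterns and the rounded sum, assuming an IEEE 754 double environment (Windows or Unix) and that the BINARY64. format displays the internal representation as described above:
data _NULL_;
   /* 31DEC9999:23:59:59 as a datetime value: 253717747199 seconds since 01JAN1960 */
   dt      = dhms('31DEC9999'D,23,59,59);
   frac    = 0.999999;
   rounded = sum(dt, frac);                    /* rounds up to 253717747200 */

   put dt=      binary64.;                     /* 64-bit pattern of the integer */
   put frac=    binary64.;                     /* 64-bit pattern of 0.999999    */
   put rounded= binary64.;                     /* pattern of the rounded sum    */
   put rounded= best32. rounded= datetime25.6; /* 253717747200, shown as 01JAN****:00:00:00 */
run;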
The problem is that the precision of numbers in SAS is not infinite. Furthermore, the numbers are base-2 floating-point values, not base-10.
A SAS datetime value is internally stored as the number of seconds since 01JAN1960. When you use a datetime value at 31DEC9999, that's an awful lot of seconds - actually, the number of seconds has 12 digits. The maximum precision of SAS numbers is about 15 decimals, so you cannot expect to get datetime values with 6 decimals of precision when the number of seconds has 12 digits.
If you add .999 instead of .999999 to your datetime value, you will get "31DEC9999:23:59:59.998993", so even the last displayed decimal is not completely precise. You will still get your 6 decimals right up to and somewhat beyond 31DEC2099, but not 8000 years later.
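As a quick sanity check, here is a sketch along the same lines (assuming the figures quoted above): six decimals still come out right for a year-2099 datetime, while even three decimals are slightly off for year 9999, and the spacing of representable values near 31DEC9999 is 2**-15 seconds, as explained earlier in this thread.
data _NULL_;
   dt2099 = dhms('31DEC2099'D,23,59,59) + 0.999999;
   dt9999 = dhms('31DEC9999'D,23,59,59) + 0.999;

   put dt2099= datetime25.6;   /* 31DEC2099:23:59:59.999999 - 6 decimals still correct */
   put dt9999= datetime25.6;   /* 31DEC9999:23:59:59.998993 - last decimals are off    */

   /* gap between adjacent representable values near 31DEC9999: about 3.05E-5 seconds */
   gap = 2**-15;
   put gap= best12.;
run;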
It looks like more than 3 decimal digits are rounded, plus there is the non-exact representation of decimal values in the binary system (see https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/lrcon/p0ji1unv6thm0dn1gp4t01a1u0g6.htm ):
data _NULL_;
   /* try successively coarser fractional increments */
   do _BY_ = 0.999999, 0.99999, 0.9999, 0.999, 0.99, 0.9;
      /* start at 31DEC9999:23:59:59 and take three steps of size _BY_ */
      do without_decimals = dhms('31DEC9999'D,23,59,59) by _BY_ while(i<3);
         put _BY_= without_decimals= datetime25.6 without_decimals= best32.12;
         i + 1;   /* sum statement: retains i, starting at 0 */
      end;
      put;
      i = 0;      /* reset the step counter for the next _BY_ value */
   end;
run;
Log shows:
_BY_=0.999999 without_decimals=31DEC9999:23:59:59.000000 without_decimals=253717747199
_BY_=0.999999 without_decimals=01JAN****:00:00:00.000000 without_decimals=253717747200
_BY_=0.999999 without_decimals=01JAN****:00:00:01.000000 without_decimals=253717747201

_BY_=0.99999 without_decimals=31DEC9999:23:59:59.000000 without_decimals=253717747199
_BY_=0.99999 without_decimals=01JAN****:00:00:00.000000 without_decimals=253717747200
_BY_=0.99999 without_decimals=01JAN****:00:00:01.000000 without_decimals=253717747201

_BY_=0.9999 without_decimals=31DEC9999:23:59:59.000000 without_decimals=253717747199
_BY_=0.9999 without_decimals=31DEC9999:23:59:59.999908 without_decimals=253717747199.999
_BY_=0.9999 without_decimals=01JAN****:00:00:00.999817 without_decimals=253717747200.999

_BY_=0.999 without_decimals=31DEC9999:23:59:59.000000 without_decimals=253717747199
_BY_=0.999 without_decimals=31DEC9999:23:59:59.998993 without_decimals=253717747199.999
_BY_=0.999 without_decimals=01JAN****:00:00:00.997986 without_decimals=253717747200.998

_BY_=0.99 without_decimals=31DEC9999:23:59:59.000000 without_decimals=253717747199
_BY_=0.99 without_decimals=31DEC9999:23:59:59.989990 without_decimals=253717747199.99
_BY_=0.99 without_decimals=01JAN****:00:00:00.979980 without_decimals=253717747200.98

_BY_=0.9 without_decimals=31DEC9999:23:59:59.000000 without_decimals=253717747199
_BY_=0.9 without_decimals=31DEC9999:23:59:59.899994 without_decimals=253717747199.9
_BY_=0.9 without_decimals=01JAN****:00:00:00.799988 without_decimals=253717747200.8
Bart
Thank you all for your very helpful explanations. Special thanks to @FreelanceReinh for the in-depth explanation!