I want to subtract/ add the smallest possible value (epsilon) to a natural number (here: 4) saved as numeric with format "best." in order to get a value non-equal to it. I found that constant('maceps') gives me that minimal floating number.
However the SAS internal representation between the natural number and the altered natural number (by + (plus) or - (minus) epsilon) seems to be identical. See the code and its log output below.
Is there an automatic variable that I should use instead that is different to constant('maceps')?
Or should the natural number be formatted in a different way (float4.)?
The solution to this post would be:
data _find_epsilon;
epsilon = constant('maceps'); /* Smallest possible floating point value of computer: 2.220446E-16 */
num_const = 4;
num_const_gt = 4 + epsilon; /* Add epsilon [+] */
num_const_lt = 4 - epsilon; /* Subtract epsilon [-] */
put (num_const num_const_gt num_const_lt epsilon) (=);
/* Check, if values match. */
if (num_const = num_const_gt) then put "EQUAL: num_const (4) = num_const (4 + epsilon)";
else put "NOT EQUAL: num_const (4) <--> (4 + epsilon)";
if (num_const = num_const_lt) then put "EQUAL: num_const (4) = num_const (4 - epsilon)";
else put "NOT EQUAL: num_const (4) <--> (4 - epsilon)";
if ((num_const = num_const_gt) and (num_const = num_const_lt)) then put "EQUAL: num_const (4) = num_const (4 + epsilon) AND num_const (4) = num_const (4 - epsilon)";
else put "NOT EQUAL: num_const (4) <--> (4 + epsilon) OR num_const (4) <--> num_const (4 - epsilon)";
format num_const best.;
run;
Log-Output (edited for readability):
num_const = 4
num_const_gt = 4
num_const_lt = 4
epsilon = 2.220446E-16 EQUAL: num_const (4) = num_const (4 + epsilon) EQUAL: num_const (4) = num_const (4 - epsilon) EQUAL: num_const (4) = num_const (4 + epsilon) AND num_const (4) = num_const (4 - epsilon) NOTE: The data set WORK._FIND_EPSILON has 1 observations and 4 variables. NOTE: DATA statement used (Total process time): real time 0.04 seconds cpu time 0.02 seconds
By adding 4 (which is 2 ** 2) to maceps, you "push" your epsilon out of the available precision by 2 binary digits; you need to correct for this, and adapt your epsilon to the base value:
data _find_epsilon;
epsilon = 4*constant('maceps'); /* Smallest possible floating point value of computer: 2.220446E-16 */
num_const = 4;
num_const_gt = num_const + (epsilon * 2 ** log2(num_const)); /* Add epsilon [+] */
num_const_lt = num_const - (epsilon * 2 ** log2(num_const)); /* Subtract epsilon [-] */
put (num_const num_const_gt num_const_lt epsilon) (=);
/* Check, if values match. */
if (num_const = num_const_gt) then put "EQUAL: num_const (4) = num_const (4 + epsilon)";
else put "NOT EQUAL: num_const (4) <--> (4 + epsilon)";
if (num_const = num_const_lt) then put "EQUAL: num_const (4) = num_const (4 - epsilon)";
else put "NOT EQUAL: num_const (4) <--> (4 - epsilon)";
if ((num_const = num_const_gt) and (num_const = num_const_lt)) then put "EQUAL: num_const (4) = num_const (4 + epsilon) AND num_const (4) = num_const (4 - epsilon)";
else put "NOT EQUAL: num_const (4) <--> (4 + epsilon) OR num_const (4) <--> num_const (4 - epsilon)";
format num_const best.;
run;
If using 4 * epsilon then the values seem to be interpreted as different from SAS.
data _get_epsilon;
epsilon = constant('maceps'); /* Smallest possible floating point value of computer: 2.220446E-16 */
num_const = 4;
/* Add epsilon i-times to see when SAS recognizes a difference. */
do i=1 by 1 until (num_const ne num_const_gt);
num_const_gt = num_const + i * epsilon;
i+1;
end;
/* Print values to log (especially "i", the minimal multiplier for epsilon to get a difference). */
put (num_const num_const_gt epsilon i) (=);;
/* Verify, that a difference was found by SAS. */
if (num_const = num_const + i * epsilon) then put "EQUAL: num_const (4) = num_const (4 + epsilon)";
else put "NOT EQUAL: num_const (4) <--> (4 + epsilon)";
format num_const best.;
run;
Log Output (edited for readability):
num_const = 4 num_const_gt = 4 epsilon = 2.220446E-16 i = 4 NOT EQUAL: num_const (4) <--> (4 + epsilon) NOTE: The data set WORK._GET_EPSILON has 1 observations and 4 variables. NOTE: DATA statement used (Total process time): real time 0.04 seconds cpu time 0.00 seconds
While this solves the problem in a way I still wonder:
Just for documentation purposes (and hopefully a help to others):
I tried to use function floor to round the result "downwards". But in the documentation (see here for floor and floorz respecitvely) for function "floor" this works only, if the argument is within range 1E-12:
My epsilon is smaller (even if multiplied by 4), so this does not get the desired result.
So instead I used the function "floorz" which worked in my situation.
By adding 4 (which is 2 ** 2) to maceps, you "push" your epsilon out of the available precision by 2 binary digits; you need to correct for this, and adapt your epsilon to the base value:
data _find_epsilon;
epsilon = 4*constant('maceps'); /* Smallest possible floating point value of computer: 2.220446E-16 */
num_const = 4;
num_const_gt = num_const + (epsilon * 2 ** log2(num_const)); /* Add epsilon [+] */
num_const_lt = num_const - (epsilon * 2 ** log2(num_const)); /* Subtract epsilon [-] */
put (num_const num_const_gt num_const_lt epsilon) (=);
/* Check, if values match. */
if (num_const = num_const_gt) then put "EQUAL: num_const (4) = num_const (4 + epsilon)";
else put "NOT EQUAL: num_const (4) <--> (4 + epsilon)";
if (num_const = num_const_lt) then put "EQUAL: num_const (4) = num_const (4 - epsilon)";
else put "NOT EQUAL: num_const (4) <--> (4 - epsilon)";
if ((num_const = num_const_gt) and (num_const = num_const_lt)) then put "EQUAL: num_const (4) = num_const (4 + epsilon) AND num_const (4) = num_const (4 - epsilon)";
else put "NOT EQUAL: num_const (4) <--> (4 + epsilon) OR num_const (4) <--> num_const (4 - epsilon)";
format num_const best.;
run;
The issue isn't really the value of the constant. That is what it is. The issue arises with precision of storage when used in calculations.
When you say the lower limit of machine precision that's one thing. But as soon as you add that 4 you have occupied part of the storage used to hold the value and don't have enough bits for the result.
You could try BEST32. but no guarantees for display though. I would force 16 decimals with an F format.
Try a value around 4.5E-16 and decrementing in a loop by that constant and see what happens.
Hi @left,
Sorry for being late to the discussion. To check whether a number in SAS is really the closest possible number less than or greater than a given number, it's best to use a format showing the internal binary floating-point representation, e.g., BINARY64. or HEX16. You can even start with such a formatted value to determine the two numbers in question. Thus you have full control over every bit. In your example (natural numbers) things are somewhat simpler than in the general case, especially if the integers are not too large. For example, integers up to 1E12=1,000,000,000,000 (and also many larger integers) will be processed correctly by the code shown below.
/* Create a sample of positive integers n for demonstration */
data have;
do n=1 to 5, 15 to 17, 1023 to 1025, 1e12-1, 1e12;
output;
end;
run;
/* Determine the closest numbers n_lt < n and n_gt > n */
data want(drop=e m);
set have;
length e $3;
h=put(n,hex16.);
e=h;
m=input(substr(h,4),hex13.);
n_gt=input(e||put(m+1,hex13.),hex16.);
if m then n_lt=input(e||put(m-1,hex13.),hex16.);
else n_lt=input(put(input(e,hex3.)-1,hex3.)||put(m-1,hex13.),hex16.);
del_gt=n_gt-n;
del_nt=n-n_lt;
format n_: hex16. del: e12.;
run;
Result with SAS 9.4M5 under Windows:
n h n_gt n_lt del_gt del_nt 1 3FF0000000000000 3FF0000000000001 3FEFFFFFFFFFFFFF 2.22045E-16 1.11022E-16 2 4000000000000000 4000000000000001 3FFFFFFFFFFFFFFF 4.44089E-16 2.22045E-16 3 4008000000000000 4008000000000001 4007FFFFFFFFFFFF 4.44089E-16 4.44089E-16 4 4010000000000000 4010000000000001 400FFFFFFFFFFFFF 8.88178E-16 4.44089E-16 5 4014000000000000 4014000000000001 4013FFFFFFFFFFFF 8.88178E-16 8.88178E-16 15 402E000000000000 402E000000000001 402DFFFFFFFFFFFF 1.77636E-15 1.77636E-15 16 4030000000000000 4030000000000001 402FFFFFFFFFFFFF 3.55271E-15 1.77636E-15 17 4031000000000000 4031000000000001 4030FFFFFFFFFFFF 3.55271E-15 3.55271E-15 1023 408FF80000000000 408FF80000000001 408FF7FFFFFFFFFF 1.13687E-13 1.13687E-13 1024 4090000000000000 4090000000000001 408FFFFFFFFFFFFF 2.27374E-13 1.13687E-13 1025 4090040000000000 4090040000000001 409003FFFFFFFFFF 2.27374E-13 2.27374E-13 999999999999 426D1A94A1FFE000 426D1A94A1FFE001 426D1A94A1FFDFFF 1.22070E-04 1.22070E-04 1E12 426D1A94A2000000 426D1A94A2000001 426D1A94A1FFFFFF 1.22070E-04 1.22070E-04
The ending hexadecimal digits in the internal representations of n (shown in h), n_gt and n_lt, "...000", "...001" and "...FFF", resp., indicate that we've really found the "neighboring" numbers of n among the internal representations. Note that the "deltas," i.e., the two differences del_gt=n_gt-n and del_nt=n-n_lt are mostly equal, but not if n is a power of 2 (in which case del_gt=2*del_nt). This is because in the latter case the exponent (cf. variable e in the code) changes in the transition from n to n_lt, as does the place value of the least significant bit. It's this special case that requires the ELSE statement in the code.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.