- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I want to subtract/ add the smallest possible value (epsilon) to a natural number (here: 4) saved as numeric with format "best." in order to get a value non-equal to it. I found that constant('maceps') gives me that minimal floating number.
However the SAS internal representation between the natural number and the altered natural number (by + (plus) or - (minus) epsilon) seems to be identical. See the code and its log output below.
Is there an automatic variable that I should use instead that is different to constant('maceps')?
Or should the natural number be formatted in a different way (float4.)?
The solution to this post would be:
- either adapt the code, so epsilon makes the altered values different to the non-altered value.
- provide the smallest number that triggers SAS to distinct between altered and non-altered value.
- If possible being a constant/ automatic variable.
data _find_epsilon;
epsilon = constant('maceps'); /* Smallest possible floating point value of computer: 2.220446E-16 */
num_const = 4;
num_const_gt = 4 + epsilon; /* Add epsilon [+] */
num_const_lt = 4 - epsilon; /* Subtract epsilon [-] */
put (num_const num_const_gt num_const_lt epsilon) (=);
/* Check, if values match. */
if (num_const = num_const_gt) then put "EQUAL: num_const (4) = num_const (4 + epsilon)";
else put "NOT EQUAL: num_const (4) <--> (4 + epsilon)";
if (num_const = num_const_lt) then put "EQUAL: num_const (4) = num_const (4 - epsilon)";
else put "NOT EQUAL: num_const (4) <--> (4 - epsilon)";
if ((num_const = num_const_gt) and (num_const = num_const_lt)) then put "EQUAL: num_const (4) = num_const (4 + epsilon) AND num_const (4) = num_const (4 - epsilon)";
else put "NOT EQUAL: num_const (4) <--> (4 + epsilon) OR num_const (4) <--> num_const (4 - epsilon)";
format num_const best.;
run;
Log-Output (edited for readability):
num_const = 4
num_const_gt = 4
num_const_lt = 4
epsilon = 2.220446E-16 EQUAL: num_const (4) = num_const (4 + epsilon) EQUAL: num_const (4) = num_const (4 - epsilon) EQUAL: num_const (4) = num_const (4 + epsilon) AND num_const (4) = num_const (4 - epsilon) NOTE: The data set WORK._FIND_EPSILON has 1 observations and 4 variables. NOTE: DATA statement used (Total process time): real time 0.04 seconds cpu time 0.02 seconds
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
By adding 4 (which is 2 ** 2) to maceps, you "push" your epsilon out of the available precision by 2 binary digits; you need to correct for this, and adapt your epsilon to the base value:
data _find_epsilon;
epsilon = 4*constant('maceps'); /* Smallest possible floating point value of computer: 2.220446E-16 */
num_const = 4;
num_const_gt = num_const + (epsilon * 2 ** log2(num_const)); /* Add epsilon [+] */
num_const_lt = num_const - (epsilon * 2 ** log2(num_const)); /* Subtract epsilon [-] */
put (num_const num_const_gt num_const_lt epsilon) (=);
/* Check, if values match. */
if (num_const = num_const_gt) then put "EQUAL: num_const (4) = num_const (4 + epsilon)";
else put "NOT EQUAL: num_const (4) <--> (4 + epsilon)";
if (num_const = num_const_lt) then put "EQUAL: num_const (4) = num_const (4 - epsilon)";
else put "NOT EQUAL: num_const (4) <--> (4 - epsilon)";
if ((num_const = num_const_gt) and (num_const = num_const_lt)) then put "EQUAL: num_const (4) = num_const (4 + epsilon) AND num_const (4) = num_const (4 - epsilon)";
else put "NOT EQUAL: num_const (4) <--> (4 + epsilon) OR num_const (4) <--> num_const (4 - epsilon)";
format num_const best.;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If using 4 * epsilon then the values seem to be interpreted as different from SAS.
data _get_epsilon;
epsilon = constant('maceps'); /* Smallest possible floating point value of computer: 2.220446E-16 */
num_const = 4;
/* Add epsilon i-times to see when SAS recognizes a difference. */
do i=1 by 1 until (num_const ne num_const_gt);
num_const_gt = num_const + i * epsilon;
i+1;
end;
/* Print values to log (especially "i", the minimal multiplier for epsilon to get a difference). */
put (num_const num_const_gt epsilon i) (=);;
/* Verify, that a difference was found by SAS. */
if (num_const = num_const + i * epsilon) then put "EQUAL: num_const (4) = num_const (4 + epsilon)";
else put "NOT EQUAL: num_const (4) <--> (4 + epsilon)";
format num_const best.;
run;
Log Output (edited for readability):
num_const = 4 num_const_gt = 4 epsilon = 2.220446E-16 i = 4 NOT EQUAL: num_const (4) <--> (4 + epsilon) NOTE: The data set WORK._GET_EPSILON has 1 observations and 4 variables. NOTE: DATA statement used (Total process time): real time 0.04 seconds cpu time 0.00 seconds
While this solves the problem in a way I still wonder:
- How do I know, if "i" is 4 or another value in other cases?
- What if I ran my code on another machine/ SAS installation?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Just for documentation purposes (and hopefully a help to others):
I tried to use function floor to round the result "downwards". But in the documentation (see here for floor and floorz respecitvely) for function "floor" this works only, if the argument is within range 1E-12:
- "Unlike the FLOOR function, the FLOORZ function uses zero fuzzing. If the argument is within 1E-12 of an integer, the FLOOR function fuzzes the result to be equal to that integer. The FLOORZ function does not fuzz the result. Therefore, with the FLOORZ function you might get unexpected results."
My epsilon is smaller (even if multiplied by 4), so this does not get the desired result.
So instead I used the function "floorz" which worked in my situation.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
By adding 4 (which is 2 ** 2) to maceps, you "push" your epsilon out of the available precision by 2 binary digits; you need to correct for this, and adapt your epsilon to the base value:
data _find_epsilon;
epsilon = 4*constant('maceps'); /* Smallest possible floating point value of computer: 2.220446E-16 */
num_const = 4;
num_const_gt = num_const + (epsilon * 2 ** log2(num_const)); /* Add epsilon [+] */
num_const_lt = num_const - (epsilon * 2 ** log2(num_const)); /* Subtract epsilon [-] */
put (num_const num_const_gt num_const_lt epsilon) (=);
/* Check, if values match. */
if (num_const = num_const_gt) then put "EQUAL: num_const (4) = num_const (4 + epsilon)";
else put "NOT EQUAL: num_const (4) <--> (4 + epsilon)";
if (num_const = num_const_lt) then put "EQUAL: num_const (4) = num_const (4 - epsilon)";
else put "NOT EQUAL: num_const (4) <--> (4 - epsilon)";
if ((num_const = num_const_gt) and (num_const = num_const_lt)) then put "EQUAL: num_const (4) = num_const (4 + epsilon) AND num_const (4) = num_const (4 - epsilon)";
else put "NOT EQUAL: num_const (4) <--> (4 + epsilon) OR num_const (4) <--> num_const (4 - epsilon)";
format num_const best.;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
The issue isn't really the value of the constant. That is what it is. The issue arises with precision of storage when used in calculations.
When you say the lower limit of machine precision that's one thing. But as soon as you add that 4 you have occupied part of the storage used to hold the value and don't have enough bits for the result.
You could try BEST32. but no guarantees for display though. I would force 16 decimals with an F format.
Try a value around 4.5E-16 and decrementing in a loop by that constant and see what happens.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi @left,
Sorry for being late to the discussion. To check whether a number in SAS is really the closest possible number less than or greater than a given number, it's best to use a format showing the internal binary floating-point representation, e.g., BINARY64. or HEX16. You can even start with such a formatted value to determine the two numbers in question. Thus you have full control over every bit. In your example (natural numbers) things are somewhat simpler than in the general case, especially if the integers are not too large. For example, integers up to 1E12=1,000,000,000,000 (and also many larger integers) will be processed correctly by the code shown below.
/* Create a sample of positive integers n for demonstration */
data have;
do n=1 to 5, 15 to 17, 1023 to 1025, 1e12-1, 1e12;
output;
end;
run;
/* Determine the closest numbers n_lt < n and n_gt > n */
data want(drop=e m);
set have;
length e $3;
h=put(n,hex16.);
e=h;
m=input(substr(h,4),hex13.);
n_gt=input(e||put(m+1,hex13.),hex16.);
if m then n_lt=input(e||put(m-1,hex13.),hex16.);
else n_lt=input(put(input(e,hex3.)-1,hex3.)||put(m-1,hex13.),hex16.);
del_gt=n_gt-n;
del_nt=n-n_lt;
format n_: hex16. del: e12.;
run;
Result with SAS 9.4M5 under Windows:
n h n_gt n_lt del_gt del_nt 1 3FF0000000000000 3FF0000000000001 3FEFFFFFFFFFFFFF 2.22045E-16 1.11022E-16 2 4000000000000000 4000000000000001 3FFFFFFFFFFFFFFF 4.44089E-16 2.22045E-16 3 4008000000000000 4008000000000001 4007FFFFFFFFFFFF 4.44089E-16 4.44089E-16 4 4010000000000000 4010000000000001 400FFFFFFFFFFFFF 8.88178E-16 4.44089E-16 5 4014000000000000 4014000000000001 4013FFFFFFFFFFFF 8.88178E-16 8.88178E-16 15 402E000000000000 402E000000000001 402DFFFFFFFFFFFF 1.77636E-15 1.77636E-15 16 4030000000000000 4030000000000001 402FFFFFFFFFFFFF 3.55271E-15 1.77636E-15 17 4031000000000000 4031000000000001 4030FFFFFFFFFFFF 3.55271E-15 3.55271E-15 1023 408FF80000000000 408FF80000000001 408FF7FFFFFFFFFF 1.13687E-13 1.13687E-13 1024 4090000000000000 4090000000000001 408FFFFFFFFFFFFF 2.27374E-13 1.13687E-13 1025 4090040000000000 4090040000000001 409003FFFFFFFFFF 2.27374E-13 2.27374E-13 999999999999 426D1A94A1FFE000 426D1A94A1FFE001 426D1A94A1FFDFFF 1.22070E-04 1.22070E-04 1E12 426D1A94A2000000 426D1A94A2000001 426D1A94A1FFFFFF 1.22070E-04 1.22070E-04
The ending hexadecimal digits in the internal representations of n (shown in h), n_gt and n_lt, "...000", "...001" and "...FFF", resp., indicate that we've really found the "neighboring" numbers of n among the internal representations. Note that the "deltas," i.e., the two differences del_gt=n_gt-n and del_nt=n-n_lt are mostly equal, but not if n is a power of 2 (in which case del_gt=2*del_nt). This is because in the latter case the exponent (cf. variable e in the code) changes in the transition from n to n_lt, as does the place value of the least significant bit. It's this special case that requires the ELSE statement in the code.