BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
left
Obsidian | Level 7

I want to subtract/ add the smallest possible value (epsilon) to a natural number (here: 4) saved as numeric with format "best." in order to get a value non-equal to it. I found that constant('maceps') gives me that minimal floating number. 

 

However the SAS internal representation between the natural number and the altered natural number (by + (plus) or - (minus) epsilon) seems to be identical. See the code and its log output below. 

 

Is there an automatic variable that I should use instead that is different to constant('maceps')? 

Or should the natural number be formatted in a different way (float4.)?

 

The solution to this post would be:

  • either adapt the code, so epsilon makes the altered values different to the non-altered value.
  • provide the smallest number that triggers SAS to distinct between altered and non-altered value.
    • If possible being a constant/ automatic variable.
data _find_epsilon;
    epsilon = constant('maceps');           /* Smallest possible floating point value of computer: 2.220446E-16 */
    num_const = 4;
    num_const_gt = 4 + epsilon;             /* Add      epsilon [+] */
    num_const_lt = 4 - epsilon;             /* Subtract epsilon [-] */
    put (num_const num_const_gt num_const_lt epsilon) (=);
    /* Check, if values match. */
    if (num_const = num_const_gt) then put "EQUAL: num_const (4) = num_const (4 + epsilon)";
    else put "NOT EQUAL: num_const (4) <--> (4 + epsilon)";
    if (num_const = num_const_lt) then put "EQUAL: num_const (4) = num_const (4 - epsilon)";
    else put "NOT EQUAL: num_const (4) <--> (4 - epsilon)";
    if ((num_const = num_const_gt) and (num_const = num_const_lt)) then put "EQUAL: num_const (4) = num_const (4 + epsilon) AND num_const (4) = num_const (4 - epsilon)";
    else put "NOT EQUAL: num_const (4) <--> (4 + epsilon) OR num_const (4) <--> num_const (4 - epsilon)";
    format num_const best.;
run;

Log-Output (edited for readability):

num_const    = 4 
num_const_gt = 4
num_const_lt = 4
epsilon = 2.220446E-16 EQUAL: num_const (4) = num_const (4 + epsilon) EQUAL: num_const (4) = num_const (4 - epsilon) EQUAL: num_const (4) = num_const (4 + epsilon) AND num_const (4) = num_const (4 - epsilon) NOTE: The data set WORK._FIND_EPSILON has 1 observations and 4 variables. NOTE: DATA statement used (Total process time): real time 0.04 seconds cpu time 0.02 seconds
1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

By adding 4 (which is 2 ** 2) to maceps, you "push" your epsilon out of the available precision by 2 binary digits; you need to correct for this, and adapt your epsilon to the base value:

data _find_epsilon;
    epsilon = 4*constant('maceps');           /* Smallest possible floating point value of computer: 2.220446E-16 */
    num_const = 4;
    num_const_gt = num_const + (epsilon * 2 ** log2(num_const));             /* Add      epsilon [+] */
    num_const_lt = num_const - (epsilon * 2 ** log2(num_const));             /* Subtract epsilon [-] */
    put (num_const num_const_gt num_const_lt epsilon) (=);
    /* Check, if values match. */
    if (num_const = num_const_gt) then put "EQUAL: num_const (4) = num_const (4 + epsilon)";
    else put "NOT EQUAL: num_const (4) <--> (4 + epsilon)";
    if (num_const = num_const_lt) then put "EQUAL: num_const (4) = num_const (4 - epsilon)";
    else put "NOT EQUAL: num_const (4) <--> (4 - epsilon)";
    if ((num_const = num_const_gt) and (num_const = num_const_lt)) then put "EQUAL: num_const (4) = num_const (4 + epsilon) AND num_const (4) = num_const (4 - epsilon)";
    else put "NOT EQUAL: num_const (4) <--> (4 + epsilon) OR num_const (4) <--> num_const (4 - epsilon)";
    format num_const best.;
run;

 

View solution in original post

5 REPLIES 5
left
Obsidian | Level 7

If using 4 * epsilon then the values seem to be interpreted as different from SAS. 

 

 

data _get_epsilon;
    epsilon   = constant('maceps');           /* Smallest possible floating point value of computer: 2.220446E-16 */
    num_const = 4;
    /* Add epsilon i-times to see when SAS recognizes a difference. */
    do i=1 by 1 until (num_const ne num_const_gt);
        num_const_gt = num_const + i * epsilon;
        i+1;
    end;
    /* Print values to log (especially "i", the minimal multiplier for epsilon to get a difference). */
    put (num_const num_const_gt epsilon i) (=);;
    /* Verify, that a difference was found by SAS. */
    if (num_const = num_const + i * epsilon) then put "EQUAL: num_const (4) = num_const (4 + epsilon)";
    else put "NOT EQUAL: num_const (4) <--> (4 + epsilon)";
    format num_const best.;
run;

Log Output (edited for readability):

 

 

num_const    = 4
num_const_gt = 4 
epsilon      = 2.220446E-16 
i            = 4
NOT EQUAL: num_const (4) <--> (4 + epsilon)
NOTE: The data set WORK._GET_EPSILON has 1 observations and 4 variables.
NOTE: DATA statement used (Total process time):
      real time           0.04 seconds
      cpu time            0.00 seconds

 

While this solves the problem in a way I still wonder:

  • How do I know, if "i" is 4 or another value in other cases?
  • What if I ran my code on another machine/ SAS installation? 

 

 

left
Obsidian | Level 7

Just for documentation purposes (and hopefully a help to others): 

I tried to use function floor to round the result "downwards". But in the documentation (see here for floor and floorz respecitvely) for function "floor" this works only, if the argument is within range 1E-12:

  • "Unlike the FLOOR function, the FLOORZ function uses zero fuzzing. If the argument is within 1E-12 of an integer, the FLOOR function fuzzes the result to be equal to that integer. The FLOORZ function does not fuzz the result. Therefore, with the FLOORZ function you might get unexpected results."

My epsilon is smaller (even if multiplied by 4), so this does not get the desired result. 

So instead I used the function "floorz" which worked in my situation. 

Kurt_Bremser
Super User

By adding 4 (which is 2 ** 2) to maceps, you "push" your epsilon out of the available precision by 2 binary digits; you need to correct for this, and adapt your epsilon to the base value:

data _find_epsilon;
    epsilon = 4*constant('maceps');           /* Smallest possible floating point value of computer: 2.220446E-16 */
    num_const = 4;
    num_const_gt = num_const + (epsilon * 2 ** log2(num_const));             /* Add      epsilon [+] */
    num_const_lt = num_const - (epsilon * 2 ** log2(num_const));             /* Subtract epsilon [-] */
    put (num_const num_const_gt num_const_lt epsilon) (=);
    /* Check, if values match. */
    if (num_const = num_const_gt) then put "EQUAL: num_const (4) = num_const (4 + epsilon)";
    else put "NOT EQUAL: num_const (4) <--> (4 + epsilon)";
    if (num_const = num_const_lt) then put "EQUAL: num_const (4) = num_const (4 - epsilon)";
    else put "NOT EQUAL: num_const (4) <--> (4 - epsilon)";
    if ((num_const = num_const_gt) and (num_const = num_const_lt)) then put "EQUAL: num_const (4) = num_const (4 + epsilon) AND num_const (4) = num_const (4 - epsilon)";
    else put "NOT EQUAL: num_const (4) <--> (4 + epsilon) OR num_const (4) <--> num_const (4 - epsilon)";
    format num_const best.;
run;

 

ballardw
Super User

The issue isn't really the value of the constant. That is what it is. The issue arises with precision of storage when used in calculations.

When you say the lower limit of machine precision that's one thing. But as soon as you add that 4 you have occupied part of the storage used to hold the value and don't have enough bits for the result.

 

You could try BEST32. but no guarantees for display though. I would force 16 decimals with an F format.

 

Try a value around 4.5E-16 and decrementing in a loop by that constant and see what happens.

 

FreelanceReinh
Jade | Level 19

Hi @left,

 

Sorry for being late to the discussion. To check whether a number in SAS is really the closest possible number less than or greater than a given number, it's best to use a format showing the internal binary floating-point representation, e.g., BINARY64. or HEX16. You can even start with such a formatted value to determine the two numbers in question. Thus you have full control over every bit. In your example (natural numbers) things are somewhat simpler than in the general case, especially if the integers are not too large. For example, integers up to 1E12=1,000,000,000,000 (and also many larger integers) will be processed correctly by the code shown below.

 

/* Create a sample of positive integers n for demonstration */

data have;
do n=1 to 5, 15 to 17, 1023 to 1025, 1e12-1, 1e12;
  output;
end;
run;

/* Determine the closest numbers n_lt < n and n_gt > n */

data want(drop=e m);
set have;
length e $3;
h=put(n,hex16.);
e=h;
m=input(substr(h,4),hex13.);
          n_gt=input(e||put(m+1,hex13.),hex16.);
if m then n_lt=input(e||put(m-1,hex13.),hex16.);
else n_lt=input(put(input(e,hex3.)-1,hex3.)||put(m-1,hex13.),hex16.);
del_gt=n_gt-n;
del_nt=n-n_lt;
format n_: hex16. del: e12.;
run;

Result with SAS 9.4M5 under Windows:

           n           h                  n_gt                n_lt            del_gt         del_nt

           1    3FF0000000000000    3FF0000000000001    3FEFFFFFFFFFFFFF    2.22045E-16    1.11022E-16
           2    4000000000000000    4000000000000001    3FFFFFFFFFFFFFFF    4.44089E-16    2.22045E-16
           3    4008000000000000    4008000000000001    4007FFFFFFFFFFFF    4.44089E-16    4.44089E-16
           4    4010000000000000    4010000000000001    400FFFFFFFFFFFFF    8.88178E-16    4.44089E-16
           5    4014000000000000    4014000000000001    4013FFFFFFFFFFFF    8.88178E-16    8.88178E-16
          15    402E000000000000    402E000000000001    402DFFFFFFFFFFFF    1.77636E-15    1.77636E-15
          16    4030000000000000    4030000000000001    402FFFFFFFFFFFFF    3.55271E-15    1.77636E-15
          17    4031000000000000    4031000000000001    4030FFFFFFFFFFFF    3.55271E-15    3.55271E-15
        1023    408FF80000000000    408FF80000000001    408FF7FFFFFFFFFF    1.13687E-13    1.13687E-13
        1024    4090000000000000    4090000000000001    408FFFFFFFFFFFFF    2.27374E-13    1.13687E-13
        1025    4090040000000000    4090040000000001    409003FFFFFFFFFF    2.27374E-13    2.27374E-13
999999999999    426D1A94A1FFE000    426D1A94A1FFE001    426D1A94A1FFDFFF    1.22070E-04    1.22070E-04
        1E12    426D1A94A2000000    426D1A94A2000001    426D1A94A1FFFFFF    1.22070E-04    1.22070E-04

The ending hexadecimal digits in the internal representations of n (shown in h), n_gt and n_lt, "...000", "...001" and "...FFF", resp., indicate that we've really found the "neighboring" numbers of n among the internal representations. Note that the "deltas," i.e., the two differences del_gt=n_gt-n and del_nt=n-n_lt are mostly equal, but not if n is a power of 2 (in which case del_gt=2*del_nt). This is because in the latter case the exponent (cf. variable e in the code) changes in the transition from n to n_ltas does the place value of the least significant bit. It's this special case that requires the ELSE statement in the code.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1266 views
  • 0 likes
  • 4 in conversation