SAS Programming

left · Posted 01-14-2021 04:37 AM

I want to subtract/ add the smallest possible value (epsilon) to a natural number (here: 4) saved as numeric with format "best." in order to get a value non-equal to it. I found that constant('maceps') gives me that minimal floating number.

However the SAS internal representation between the natural number and the altered natural number (by + (plus) or - (minus) epsilon) seems to be identical. See the code and its log output below.

Is there an automatic variable that I should use instead that is different to constant('maceps')?

Or should the natural number be formatted in a different way (float4.)?

The solution to this post would be:

either adapt the code, so epsilon makes the altered values different to the non-altered value.
provide the smallest number that triggers SAS to distinct between altered and non-altered value.
- If possible being a constant/ automatic variable.

data _find_epsilon;
    epsilon = constant('maceps');           /* Smallest possible floating point value of computer: 2.220446E-16 */
    num_const = 4;
    num_const_gt = 4 + epsilon;             /* Add      epsilon [+] */
    num_const_lt = 4 - epsilon;             /* Subtract epsilon [-] */
    put (num_const num_const_gt num_const_lt epsilon) (=);
    /* Check, if values match. */
    if (num_const = num_const_gt) then put "EQUAL: num_const (4) = num_const (4 + epsilon)";
    else put "NOT EQUAL: num_const (4) <--> (4 + epsilon)";
    if (num_const = num_const_lt) then put "EQUAL: num_const (4) = num_const (4 - epsilon)";
    else put "NOT EQUAL: num_const (4) <--> (4 - epsilon)";
    if ((num_const = num_const_gt) and (num_const = num_const_lt)) then put "EQUAL: num_const (4) = num_const (4 + epsilon) AND num_const (4) = num_const (4 - epsilon)";
    else put "NOT EQUAL: num_const (4) <--> (4 + epsilon) OR num_const (4) <--> num_const (4 - epsilon)";
    format num_const best.;
run;

Log-Output (edited for readability):

num_const    = 4 
num_const_gt = 4 
num_const_lt = 4 
epsilon      = 2.220446E-16
EQUAL: num_const (4) = num_const (4 + epsilon)
EQUAL: num_const (4) = num_const (4 - epsilon)
EQUAL: num_const (4) = num_const (4 + epsilon) AND num_const (4) = num_const (4 - epsilon)
NOTE: The data set WORK._FIND_EPSILON has 1 observations and 4 variables.
NOTE: DATA statement used (Total process time):
      real time           0.04 seconds
      cpu time            0.02 seconds

Kurt_Bremser · Posted 01-14-2021 05:17 AM

By adding 4 (which is 2 ** 2) to maceps, you "push" your epsilon out of the available precision by 2 binary digits; you need to correct for this, and adapt your epsilon to the base value:

data _find_epsilon;
    epsilon = 4*constant('maceps');           /* Smallest possible floating point value of computer: 2.220446E-16 */
    num_const = 4;
    num_const_gt = num_const + (epsilon * 2 ** log2(num_const));             /* Add      epsilon [+] */
    num_const_lt = num_const - (epsilon * 2 ** log2(num_const));             /* Subtract epsilon [-] */
    put (num_const num_const_gt num_const_lt epsilon) (=);
    /* Check, if values match. */
    if (num_const = num_const_gt) then put "EQUAL: num_const (4) = num_const (4 + epsilon)";
    else put "NOT EQUAL: num_const (4) <--> (4 + epsilon)";
    if (num_const = num_const_lt) then put "EQUAL: num_const (4) = num_const (4 - epsilon)";
    else put "NOT EQUAL: num_const (4) <--> (4 - epsilon)";
    if ((num_const = num_const_gt) and (num_const = num_const_lt)) then put "EQUAL: num_const (4) = num_const (4 + epsilon) AND num_const (4) = num_const (4 - epsilon)";
    else put "NOT EQUAL: num_const (4) <--> (4 + epsilon) OR num_const (4) <--> num_const (4 - epsilon)";
    format num_const best.;
run;

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

View solution in original post

left · Posted 01-14-2021 05:17 AM

If using 4 * epsilon then the values seem to be interpreted as different from SAS.

data _get_epsilon;
    epsilon   = constant('maceps');           /* Smallest possible floating point value of computer: 2.220446E-16 */
    num_const = 4;
    /* Add epsilon i-times to see when SAS recognizes a difference. */
    do i=1 by 1 until (num_const ne num_const_gt);
        num_const_gt = num_const + i * epsilon;
        i+1;
    end;
    /* Print values to log (especially "i", the minimal multiplier for epsilon to get a difference). */
    put (num_const num_const_gt epsilon i) (=);;
    /* Verify, that a difference was found by SAS. */
    if (num_const = num_const + i * epsilon) then put "EQUAL: num_const (4) = num_const (4 + epsilon)";
    else put "NOT EQUAL: num_const (4) <--> (4 + epsilon)";
    format num_const best.;
run;

Log Output (edited for readability):

num_const    = 4
num_const_gt = 4 
epsilon      = 2.220446E-16 
i            = 4
NOT EQUAL: num_const (4) <--> (4 + epsilon)
NOTE: The data set WORK._GET_EPSILON has 1 observations and 4 variables.
NOTE: DATA statement used (Total process time):
      real time           0.04 seconds
      cpu time            0.00 seconds

While this solves the problem in a way I still wonder:

How do I know, if "i" is 4 or another value in other cases?
What if I ran my code on another machine/ SAS installation?

left · Posted 01-14-2021 05:43 AM

Just for documentation purposes (and hopefully a help to others):

I tried to use function floor to round the result "downwards". But in the documentation (see here for floor and floorz respecitvely) for function "floor" this works only, if the argument is within range 1E-12:

"Unlike the FLOOR function, the FLOORZ function uses zero fuzzing. If the argument is within 1E-12 of an integer, the FLOOR function fuzzes the result to be equal to that integer. The FLOORZ function does not fuzz the result. Therefore, with the FLOORZ function you might get unexpected results."

My epsilon is smaller (even if multiplied by 4), so this does not get the desired result.

So instead I used the function "floorz" which worked in my situation.

Kurt_Bremser · Posted 01-14-2021 05:17 AM

By adding 4 (which is 2 ** 2) to maceps, you "push" your epsilon out of the available precision by 2 binary digits; you need to correct for this, and adapt your epsilon to the base value:

data _find_epsilon;
    epsilon = 4*constant('maceps');           /* Smallest possible floating point value of computer: 2.220446E-16 */
    num_const = 4;
    num_const_gt = num_const + (epsilon * 2 ** log2(num_const));             /* Add      epsilon [+] */
    num_const_lt = num_const - (epsilon * 2 ** log2(num_const));             /* Subtract epsilon [-] */
    put (num_const num_const_gt num_const_lt epsilon) (=);
    /* Check, if values match. */
    if (num_const = num_const_gt) then put "EQUAL: num_const (4) = num_const (4 + epsilon)";
    else put "NOT EQUAL: num_const (4) <--> (4 + epsilon)";
    if (num_const = num_const_lt) then put "EQUAL: num_const (4) = num_const (4 - epsilon)";
    else put "NOT EQUAL: num_const (4) <--> (4 - epsilon)";
    if ((num_const = num_const_gt) and (num_const = num_const_lt)) then put "EQUAL: num_const (4) = num_const (4 + epsilon) AND num_const (4) = num_const (4 - epsilon)";
    else put "NOT EQUAL: num_const (4) <--> (4 + epsilon) OR num_const (4) <--> num_const (4 - epsilon)";
    format num_const best.;
run;

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

ballardw · Posted 01-14-2021 05:17 AM

The issue isn't really the value of the constant. That is what it is. The issue arises with precision of storage when used in calculations.

When you say the lower limit of machine precision that's one thing. But as soon as you add that 4 you have occupied part of the storage used to hold the value and don't have enough bits for the result.

You could try BEST32. but no guarantees for display though. I would force 16 decimals with an F format.

Try a value around 4.5E-16 and decrementing in a loop by that constant and see what happens.

FreelanceReinh · Posted 01-14-2021 12:57 PM

Hi @left,

Sorry for being late to the discussion. To check whether a number in SAS is really the closest possible number less than or greater than a given number, it's best to use a format showing the internal binary floating-point representation, e.g., BINARY64. or HEX16. You can even start with such a formatted value to determine the two numbers in question. Thus you have full control over every bit. In your example (natural numbers) things are somewhat simpler than in the general case, especially if the integers are not too large. For example, integers up to 1E12=1,000,000,000,000 (and also many larger integers) will be processed correctly by the code shown below.

/* Create a sample of positive integers n for demonstration */

data have;
do n=1 to 5, 15 to 17, 1023 to 1025, 1e12-1, 1e12;
  output;
end;
run;

/* Determine the closest numbers n_lt < n and n_gt > n */

data want(drop=e m);
set have;
length e $3;
h=put(n,hex16.);
e=h;
m=input(substr(h,4),hex13.);
          n_gt=input(e||put(m+1,hex13.),hex16.);
if m then n_lt=input(e||put(m-1,hex13.),hex16.);
else n_lt=input(put(input(e,hex3.)-1,hex3.)||put(m-1,hex13.),hex16.);
del_gt=n_gt-n;
del_nt=n-n_lt;
format n_: hex16. del: e12.;
run;

Result with SAS 9.4M5 under Windows:

           n           h                  n_gt                n_lt            del_gt         del_nt

           1    3FF0000000000000    3FF0000000000001    3FEFFFFFFFFFFFFF    2.22045E-16    1.11022E-16
           2    4000000000000000    4000000000000001    3FFFFFFFFFFFFFFF    4.44089E-16    2.22045E-16
           3    4008000000000000    4008000000000001    4007FFFFFFFFFFFF    4.44089E-16    4.44089E-16
           4    4010000000000000    4010000000000001    400FFFFFFFFFFFFF    8.88178E-16    4.44089E-16
           5    4014000000000000    4014000000000001    4013FFFFFFFFFFFF    8.88178E-16    8.88178E-16
          15    402E000000000000    402E000000000001    402DFFFFFFFFFFFF    1.77636E-15    1.77636E-15
          16    4030000000000000    4030000000000001    402FFFFFFFFFFFFF    3.55271E-15    1.77636E-15
          17    4031000000000000    4031000000000001    4030FFFFFFFFFFFF    3.55271E-15    3.55271E-15
        1023    408FF80000000000    408FF80000000001    408FF7FFFFFFFFFF    1.13687E-13    1.13687E-13
        1024    4090000000000000    4090000000000001    408FFFFFFFFFFFFF    2.27374E-13    1.13687E-13
        1025    4090040000000000    4090040000000001    409003FFFFFFFFFF    2.27374E-13    2.27374E-13
999999999999    426D1A94A1FFE000    426D1A94A1FFE001    426D1A94A1FFDFFF    1.22070E-04    1.22070E-04
        1E12    426D1A94A2000000    426D1A94A2000001    426D1A94A1FFFFFF    1.22070E-04    1.22070E-04

The ending hexadecimal digits in the internal representations of n (shown in h), n_gt and n_lt, "...000", "...001" and "...FFF", resp., indicate that we've really found the "neighboring" numbers of n among the internal representations. Note that the "deltas," i.e., the two differences del_gt=n_gt-n and del_nt=n-n_lt are mostly equal, but not if n is a power of 2 (in which case del_gt=2*del_nt). This is because in the latter case the exponent (cf. variable e in the code) changes in the transition from n to n_lt, as does the place value of the least significant bit. It's this special case that requires the ELSE statement in the code.

SAS Programming

Find epsilon/ smallest possible value [constant('maceps') is NOT the solution]

Re: Find epsilon/ smallest possible value [constant('maceps') is NOT the solution]

Re: Find epsilon/ smallest possible value [constant('maceps') is NOT the solution]

Re: Find epsilon/ smallest possible value [constant('maceps') is NOT the solution]

Re: Find epsilon/ smallest possible value [constant('maceps') is NOT the solution]

Re: Find epsilon/ smallest possible value [constant('maceps') is NOT the solution]

Re: Find epsilon/ smallest possible value [constant('maceps') is NOT the solution]

WaHe - Waste Heat Solution

Unlocking Business Value From Improving SAS Fraud Solutions

ModelOps, MLOps, and Finding Value in Analytics

Food Insecurity: Solutions Beyond Food

Management Solutions - Enel

Follow Us

What is...

SAS Programming

Our biggest data and AI event of the year.

SAS Training: Just a Click Away

Follow Us

What is...