Solved: SAS issue with sum

hhchenfx · Posted 01-17-2024 07:23 PM

Hi Everyone,

I run the code below to get a weight schedule.

The problem is that when I filter for (sum of elements) = 1, SAS does not return rows such as factor1=0, factor2=0 and factor3=1.

I am not sure whether the issue is in my SAS setup or something else, so I kept the observation count from each step so you can check.

Can you please help review it?

Thanks,

HHC

data have;
  do factor1 = 0 to 1 by 0.1;
    do factor2 = 0 to 1 by 0.1;
      do factor3 = 0 to 1 by 0.1;
        output;
      end;
    end;
  end;
run;
/*The data set WORK.HAVE has 1331 observations and 3 variables*/

data weight_schedule; set have;
if factor1 + factor2 + factor3 = 1;
run;
/*The data set WORK.WEIGHT_SCHEDULE has 53 observations*/

/******** this returns the CORRECT count *******/

data CORRECT_COUNT; set have;
if 0.99<factor1 + factor2 + factor3<1.01;
run;
/* The data set WORK.CORRECT_COUNT has 66 observations */

Patrick · Posted 01-17-2024 08:12 PM

There is documentation, and there are long discussions, about numeric precision and the representation of floating-point numbers. You'll find them with a bit of searching.

What was surprising to me is that there is such a precision issue with the values 0, 0 and 1. This must be related to how SAS populates the iteration variable when you use BY 0.1.

If no further answers explain this as something that is already widely known, it would be worth raising it directly with SAS Tech Support.

For your case: round your sum to a decimal place that is not significant for your data before the comparison.

data have;
  do factor1 = 0 to 1 by 0.1;
    do factor2 = 0 to 1 by 0.1;
      do factor3 = 0 to 1 by 0.1;
        output;
      end;
    end;
  end;
run;

data test;
  set have;
  if round(factor1 + factor2 + factor3,.000001) = 1;
  precision_diff_flg= round(factor1 + factor2 + factor3,.000001) ne (factor1 + factor2 + factor3);
run;

proc print data=test;
run;

SASKiwi · Posted 01-17-2024 08:14 PM

You are getting a numeric precision problem. It's not a software bug. Try this:

if round(sum(factor1, factor2, factor3), 0.1) = 1;

mkeintz · Posted 01-17-2024 08:19 PM

This is a numeric precision problem, not a SAS problem. It will happen with all computing software using floating point arithmetic (which needs to be the case given that 0.1 is the do loop increment).

For instance, the loop

do factor1=0 to 1 by 0.1;

will NOT produce a factor1 that is exactly equal to 1, because at least one of the values 0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9 cannot be exactly represented using floating point arithmetic.

Therefore none of the factor combinations that should have two zeroes and a one will appear. The same will happen with some of the other combinations.
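To see this directly, one can print what the loop actually stores. The following is only a sketch: the variable names nearest and gap are made up for illustration, and ROUND with a 0.1 unit is used as an assumed stand-in for "the intended tenth".

```sas
/* Sketch: inspect the values a BY 0.1 loop really produces.            */
/* "nearest" and "gap" are illustrative names, not from the thread.     */
data _null_;
  do factor1 = 0 to 1 by 0.1;
    nearest = round(factor1, 0.1);   /* loop value snapped to the nearest tenth */
    gap     = factor1 - nearest;     /* nonzero gap = accumulated rounding error */
    /* HEX16. shows the stored bit pattern, so tiny differences are visible */
    put factor1= best32. factor1= hex16. gap= e10.;
  end;
run;
```

Any iteration where gap is not zero holds a value that only looks like a clean tenth when printed with a default format.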

I suggest you use

do factor=0 to 10 by 1;

The program below will then produce 66 observations:

data have;
  do factor1 = 0 to 10 by 1;
    do factor2 = 0 to 10 by 1;
      do factor3 = 0 to 10 by 1;
        output;
      end;
    end;
  end;
run;
/*The data set WORK.HAVE has 1331 observations and 3 variables*/

data weight_schedule; set have;
  if factor1 + factor2 + factor3 = 10;
run;

If you want, you can divide the factors and their sums by 10 AFTER you do the filtering.
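As a rough sketch of that last step, the filtering and the division can be done in one DATA step. The counters i, j and k are hypothetical names introduced here; only factor1 to factor3 come from the original post.

```sas
/* Sketch: filter on exact integer sums, then rescale to weights.  */
/* i, j, k are illustrative loop counters, dropped from the output. */
data weight_schedule;
  do i = 0 to 10;
    do j = 0 to 10;
      do k = 0 to 10;
        /* Integer arithmetic is exact, so this test is reliable */
        if i + j + k = 10 then do;
          factor1 = i / 10;
          factor2 = j / 10;
          factor3 = k / 10;
          output;
        end;
      end;
    end;
  end;
  drop i j k;
run;
```

The resulting weights are still floating point values, so any later test of their sum against 1 should be rounded rather than compared exactly.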

Edited addition:

PS: It's more than just the representation of numbers; it's also the order in which they are added.

For instance, 0.1+0.1+0.8 (factor1+factor2+factor3) appears in WEIGHT_SCHEDULE, but 0.1+0.8+0.1 and 0.8+0.1+0.1 do not. I.e. 0.1+0.8 (in either order) does not generate the exact representation of 0.9.
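One way to check which combinations are affected is to list the rows whose sum is 1 after rounding but not under an exact comparison. This is a sketch only: the data set name dropped, the variables total and gap, and the rounding unit 1e-9 are assumptions chosen for illustration.

```sas
/* Rebuild the original grid with the fractional increment */
data have;
  do factor1 = 0 to 1 by 0.1;
    do factor2 = 0 to 1 by 0.1;
      do factor3 = 0 to 1 by 0.1;
        output;
      end;
    end;
  end;
run;

/* Sketch: keep rows that the exact "= 1" test misses.          */
/* "dropped", "total" and "gap" are illustrative names.         */
data dropped;
  set have;
  total = factor1 + factor2 + factor3;
  if round(total, 1e-9) = 1 and total ne 1;  /* 1 after rounding, not exactly 1 */
  gap = total - 1;                           /* size of the discrepancy */
  format gap e10.;
run;

proc print data=dropped;
run;
```

Given the counts reported in the question (66 rows with a tolerance versus 53 with the exact test), this should list the 13 combinations that the exact comparison leaves out.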
