Solved: Retain variable output issue

Mushy · Posted 09-28-2021 05:49 PM

The last value of "TOT" in the table "Final" should have been 0, but a tiny negative number was stored instead.

Any idea what is causing the issue?

data test;
format val 23.19;
input key val;
cards;
1 -0.16
1 0.15
1 0.01
;
run;

data Final;
set test;
retain TOT;
by key ;
if FIRST.key then do; TOT = 0;
end;
TOT = TOT+val;
output;
run;

PaigeMiller · Posted 09-28-2021 05:59 PM

Machine precision (or machine epsilon) is the issue. Most non-integer numbers cannot be represented exactly in computer binary arithmetic.

https://en.wikipedia.org/wiki/Machine_epsilon

https://go.documentation.sas.com/doc/en/pgmsascdc/v_014/lepg/p0dv87zb3bnse6n1mqo360be70qr.htm
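To see this on the example data, here is a minimal sketch (not part of the original reply) that prints the stored bit patterns of the three inputs and of their sum. It assumes the usual IEEE 754 double-precision storage that SAS uses on Windows and Linux; the variable names a, b, c and is_zero are made up for illustration.

```sas
/* Illustration only: inspect how the three inputs and their sum are stored. */
data _null_;
   a = -0.16;
   b =  0.15;
   c =  0.01;
   tot = a + b + c;        /* same order of addition as in the question */
   is_zero = (tot = 0);    /* 1 if the sum is exactly zero, 0 otherwise */
   put a= hex16. / b= hex16. / c= hex16.;   /* raw floating-point representation */
   put tot= e12. tot= hex16. is_zero=;
run;
```

If the hex patterns show long repeating tails rather than trailing zeros, the decimal values were stored as nearby binary approximations, and the small errors need not cancel when they are added.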

--
Paige Miller

Mushy · Posted 09-28-2021 06:26 PM

Thank you Miller for the information.

Is there a workaround or solution for this issue in my example?

ballardw · Posted 09-28-2021 06:53 PM

One way that may be a bit fragile:

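data Final;
   set test;
   retain TOT;
   by key ;
   if FIRST.key then do; TOT = 0;
   end;
   tempval = val*100;              /* scale hundredths up to (nearly) whole numbers */
   TOT = TOT+tempval;              /* accumulate in the scaled units */
   if last.key then tot=tot/100;   /* scale back only on the last row of each key */
   output;
run;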

This attempts to use integer arithmetic as much as practical. The fragile part lies partly in knowing the minimum multiplier. There is also some behind-the-scenes rounding going on. It works for this small example.

If you have enough decimal places, though, this will likely still run into precision issues.

mkeintz · Posted 09-28-2021 10:46 PM

Or you could round each sum as it is calculated. If the highest precision of the original data is in the hundredths, then round to nearest .01, as in:

data test;
format val 23.19;
input key val;
cards;
1 -0.16
1 0.15
1 0.01
;
run;

data Final;
   set test;
   retain TOT;
   by key ;
   if FIRST.key then do; TOT = 0;end;

   tot=round(sum(tot,val),.01);
   output;
run;

FreelanceReinh · Posted 09-29-2021 08:43 AM

Hello @Mushy,

I mostly use the ROUND function (as mkeintz suggested) in situations like this, often with a very small rounding unit (e.g., 1E-9) -- small enough not to influence results, but large enough to remove the tiny rounding error (!) that would occur otherwise. In your example the absolute rounding error is <1E-17.
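As a rough sketch of that idea applied to your step (the 1E-9 unit is an assumption that suits data recorded to hundredths; choose a unit well below the resolution of your own data):

```sas
data Final;
   set test;
   by key;
   retain TOT;
   if first.key then TOT = 0;
   /* Assumed rounding unit 1E-9: far smaller than the 0.01 resolution of val, */
   /* but far larger than the floating-point error being removed.              */
   TOT = round(sum(TOT, val), 1e-9);
run;
```

Because the rounding is applied to every running sum, the tiny error is removed at each step instead of being carried into the next addition.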

Even if you want to switch to integer calculations (as ballardw suggested), you should guard against such tiny rounding errors while performing the conversion to integers, so that you need to round anyway:

data check;
do n=0 to 99;
  val=n/100;
  tempval = val*100; /* no rounding --> non-integers can occur */
  r = round(val*100, 1e-9);
  clean  = (tempval=n);
  cleanr = (r=n);
  output;
end;
run;

proc freq data=check;
tables clean cleanr;
run;

Result (with SAS 9.4M5 under Windows):

                                  Cumulative    Cumulative
clean    Frequency     Percent     Frequency      Percent
----------------------------------------------------------
    0           8        8.00             8         8.00
    1          92       92.00           100       100.00


                                   Cumulative    Cumulative
cleanr    Frequency     Percent     Frequency      Percent
-----------------------------------------------------------
     1         100      100.00           100       100.00
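Putting the two suggestions together, one possible sketch of the integer approach with the rounding guard built in looks like this. It assumes every value of val has at most two decimal places; cents is a hypothetical helper variable, not something from the earlier posts.

```sas
data Final;
   set test;
   by key;
   retain cents;                        /* hypothetical helper: running total in whole hundredths */
   if first.key then cents = 0;
   cents = cents + round(val*100);      /* ROUND without a unit rounds to the nearest integer */
   TOT = cents/100;                     /* convert back to the original scale on every row */
   drop cents;
run;
```

Here the accumulation happens entirely on integers, which are represented exactly, so the only inexact step is the final division by 100 on each row.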
