Re: Numeric precision and discontinuity

topkatz · Posted 10-03-2008 12:09 AM

Hi.

This isn't specifically on O.R. question, but it has to do with a kind of numerical computing that perhaps you'll have some expertise with.

I'm looking for advice / feedback / abuse concerning a numerical computation issue. Briefly, I'm trying to solve the equation y=f(x) for the value of x, given the function f() and the value y.

I have provided some code below for a function that depends on three parameters, N, M, and r, with a target value of G. I have fixed N = 730 and M = 1, and want to find the value of r that gives G = 730. (The code is designed to look at a range of values of N, with G changing proportionally, but in this example I didn't vary N.) The function has monotonicity properties that guarantee the value of r will fall between 0 and 1; when r = 0, G = 0, and G increases as r increases until G ~ 2**N when r = 1. The solution method is a simple bisection that moves r down if G is above the target, and moves r up if G is below the target, until either the difference between the computed G and the target is within a certain tolerance level, or an upper limit of iterations is reached.

I have run the code below in SAS 9.1.3 on both Windows XP and HP-UX. I get slightly different results on the two systems, but in both cases there is a discontinuity near the solution; as the value of r converges to 0.00929, the value of G flip-flops between 753 and 540, but never seems to get any closer to the target value of 730.

This is a tricky function. It is a sum of about N terms (recall N = 730), and each term involves the product of a very large number (comb(N,k)) with a very small number.

Is there a better way to do this computation to converge on the target value, or is this just too much to ask of SAS?

************************************************************************

%let Nlo = 730 ;
%let Nhi = 730 ;
%let Ninc = 10 ;
%let N0 = 1000 ;
%let G0 = 1000 ;
%let M0 = 1 ;
%let rstart = 0.5 ;
%let gtol = 0.0001 ;
%let ntries = 100 ;
%let bouncelim = 4 ;
%let vnum = 01 ;
%*let vnum = %sysfunc(putn(%sysevalf(&vnum. + 1),z2.)) ;
%put vnum = &vnum. ;
%let dnum = 01 ;
%*let dnum = %sysfunc(putn(%sysevalf(&dnum. + 1),z2.)) ;
%put dnum = &dnum. ;

* trial and error to find r that fits N, M, G for several scaled N, G, with debug info ;
data rbyn_&vnum._d&dnum. ;
keep N M G r rfound i rhi rlo groups gdif rdif mingdif rmingdif gmingdif GBOUNCE ;
array combnk{2:&N0.} ;
M = &M0. ;
do N = &Nlo. to &Nhi. by &Ninc. ;
G = N * %sysevalf(&G0. / &N0.) ;
do k = 2 to N ;
combnk{k} = comb(N,k) ;
end ; * k ;
rfound = 0 ;
r = &rstart. ;
rhi = 1 ;
rlo = 0 ;
pgdif = . ;
mingdif = . ;
rmingdif = . ;
gmingdif = . ;
gbounce = 0 ;
do i = 1 to &ntries. while (rfound = 0) ;
groups = 0 ;
do k = 2 to N ;
qk = (1 - r**k)**M ;
pk = 1 - qk ;
groups_k = combnk{k} * pk ;
groups + groups_k ;
end ; * k ;
gdif = (groups - G) ;
absgdif = abs(gdif) ;
rdif = (rhi - rlo) ;
output ;
if (absgdif < abs(mingdif)) or missing(mingdif) then do ;
if abs(sum(gdif,-mingdif)) < %sysevalf(0.5 * &gtol.) then do ;
gbounce + 1 ;
end ;
else do ;
gbounce = 0 ;
end ;
mingdif = gdif ;
rmingdif = r ;
gmingdif = groups ;
end ;
else if ((absgdif > abs(mingdif)) and (gdif * mingdif > 0)) or (gbounce ge &bouncelim.) then do ; * should be impossible ;
rfound = -1 ;
r = rmingdif ;
groups = gmingdif ;
end ;
if absgdif < &gtol. then do ;
rfound = 1 ;
end ;
else if groups > G then do ;
rhi = r ;
if . < pgdif < 0 then do ;
sumdiff = (gdif - pgdif) ;
r = ((gdif * rlo) - (pgdif * rhi)) / sumdiff ;
end ;
else do ;
r = 0.5 * (rhi + rlo) ;
end ;
end ;
else if groups < G then do ;
rlo = r ;
if pgdif > 0 then do ;
sumdiff = (pgdif - gdif) ;
r = ((pgdif * rlo) - (gdif * rhi)) / sumdiff ;
end ;
else do ;
r = 0.5 * (rlo + rhi) ;
end ;
end ;
pgdif = gdif ;
* output ;
end ; * i ;
output ;
end ; * N ;
run ;

************************************************************************

Thanks!

-- TMK --
"The Macro Klutz"

Cynthia_sas · Posted 10-03-2008 10:04 AM

Hi:
Perhaps these will shed some light on how operating systems can have different precision issues:
http://support.sas.com/documentation/cdl/en/lrcon/59522/HTML/default/a000695157.htm
http://support.sas.com/kb/31/560.html
http://support.sas.com/techsup/technote/ts654.pdf
http://support.sas.com/documentation/cdl/en/connref/59593/HTML/default/implications.htm
http://support.sas.com/kb/20/587.html
http://support.sas.com/kb/30/534.html

cynthia

topkatz · Posted 10-06-2008 10:07 AM

Thank you, Cynthia@sas. I also posted this question on SAS-L and got a couple of interesting responses there (); one of them cited one document that's the same as the first one on your list, and cited another one that's referred to in ts654.

By the way, in related investigations, I've found that the ** exponentiation operator sacrifices some accuracy; even x**1 is not quite the same as x, so if you have x**n and n is an integer, you'll be slower but more accurate if you do x*x*...*x.

-- TMK --
"The Macro Klutz"

Hutch_sas · Posted 10-06-2008 11:09 AM

My guess is that your numerical difficulty is caused by the computation of

1- (1-r**k)**M

because (1-r**k)**M is very close to 1, and the difference gets los in the numerical precision. Perhaps you can expand (1-r**k)**M in a binomial series and cancel out the 1 to get better accuracy?

Hutch_sas · Posted 10-06-2008 12:14 PM

In fact, for your specific case, where M = 1, if you make the following modification to your program to simplify the expression, you converge to an answer in about 24 iterations to r = 0.0090868548:

/*
qk = (1 - r**k)**M ;
pk = 1 - qk ;
*/
pk = r ** k; /* true if M == 1 */

topkatz · Posted 10-07-2008 05:04 PM

Thank you, that was a really useful suggestion. In fact, rearranging 1 - (1 - r**k)**M as r**k * (1 + (1 - r**k) + ... + (1 - r**k)**(M-1)) works better for M > 1, too. It looks like it's better to add a bunch of small numbers than it is to subtract two (relatively) large ones, at least in this case.

Numeric precision and discontinuity