Re: Compute large number in SAS

trungcva112 · Posted 09-04-2018 09:29 AM

Ksharp · Posted 09-04-2018 09:44 AM

@Rick_SAS, do you have any ideas?

Tom · Posted 09-04-2018 09:53 AM

What types of calculations are you trying to do? Can you just change the scale of the numbers?

In addition to the magnitude issue, there is a limit to the number of significant digits that can be represented.

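data _null_;
  length constant $20 value 8;
  /* EXACTINT: largest integer up to which all integers are stored exactly */
  /* BIG and SMALL: largest and smallest positive double-precision values  */
  do constant='EXACTINT','BIG','SMALL';
    value = constant(constant);
    put constant value best32.;
  end;
run;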

EXACTINT                 9007199254740992
BIG              1.7976931348623E308
SMALL             2.2250738585072E-308
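
The significant-digit limit can be seen directly with a small check. This is a minimal sketch, assuming a platform that stores numeric values as IEEE double precision (for example Windows or Linux); the variable names are illustrative only.

```sas
data _null_;
  x = constant('EXACTINT');   /* 2**53 on IEEE platforms */
  y = x + 1;                  /* 2**53 + 1 has no exact double representation */
  /* On IEEE platforms the sum is expected to round back to x */
  if y = x then put 'x + 1 is stored as the same value as x';
  else put 'x + 1 is stored as a different value than x';
  put x= best32. y= best32.;
run;
```

Beyond EXACTINT, adding a small increment to a large value may not change the stored number at all, which is a separate problem from overflow.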

Rick_SAS · Posted 09-04-2018 10:00 AM

When numerical analysts need to deal with large numbers, they use the log scale. For example, you should never compute the likelihood function, which will usually overflow, but always use the log-likelihood function. Similarly, never compute the determinant of an ill-conditioned matrix when the log-determinant is what you need.

SAS has many built-in functions such as LOGPDF that compute the logarithm of standard probability distributions, or LGAMMA, LCOMB, and LFACT, which produce the log of various combinatorial numbers. See my blog post on this topic.
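
As an illustration of the log-scale idea (not taken from the blog post), the following sketch evaluates a binomial probability whose intermediate factors cannot be stored directly. The values n=2000, k=1000, and p=0.5 are made up for the example: the binomial coefficient is far larger than CONSTANT('BIG'), and p**n underflows, yet the log of the probability is a modest number.

```sas
data _null_;
  n = 2000; k = 1000; p = 0.5;      /* illustrative values only */
  /* log of C(n,k) * p**k * (1-p)**(n-k), assembled term by term */
  logprob = lcomb(n, k) + k*log(p) + (n - k)*log(1 - p);
  /* the same quantity from the built-in log-density function */
  check = logpdf('BINOMIAL', k, p, n);
  /* exponentiate only at the end, when the result is representable */
  prob = exp(logprob);
  put logprob= check= prob=;
run;
```

The two log values should agree up to rounding, and only the final EXP call leaves the log scale.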

trungcva112 · Posted 09-04-2018 10:47 AM

Hi, I need to compute this equation (a, d, u, eb, es, b, and s are all given). I just need to plug in the numbers, but SAS cannot compute it when u, b, or s is large.

I don't think this equation can be transformed further by using LOG or another transformation.

Rick_SAS · Posted 09-04-2018 10:53 AM

Please provide the values of the parameters.

trungcva112 · Posted 09-04-2018 10:59 AM

Hi @Rick_SAS. Some typical cases in my sample that SAS cannot compute are:

a=0.58 d=0.41 u=344.17 eb=470.89 es=388.29 b=1415 s=104 or

a=0.18 d=0.09 u=983.15 eb=184.69 es=280.79 b=1097 s=760

Rick_SAS · Posted 09-04-2018 11:38 AM

What problem are you trying to solve? Is this a likelihood function? If so, for what distribution? How did you obtain the parameter estimates?

When I see large numbers like these, I always try to ascertain where the numbers came from and what they will be used for. Suppose I tell you that the value 1/pi is 4E187. What does this imply? For example, if pi is a probability, it would mean that the probability is effectively zero. Does that extreme value make sense for your application, or are you expecting a reasonable probability like 0.1?

trungcva112 · Posted 09-04-2018 08:07 PM

Thanks @Rick_SAS. Yes, you are correct: Pi here is a probability. If I knew that 1/Pi is very large (for example, 4E187), the probability would be effectively zero, and I could set Pi = 0 (I don't need the extremely low value of Pi).

The issue is that I cannot tell whether Pi is extremely low (so that I could comfortably set it to zero) or has a meaningful value unless I can compute the equation above. But whenever u > 709, or b or s is large enough, SAS cannot compute the intermediate values e^u or (1+u/eb)^b. So I can't tell whether Pi is extremely low (in which case it has no meaning) or takes a meaningful value like 0.1111... or 0.234516...

So this is the problem. Do you have any idea how I can get around this issue?
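
One way to see where a parameter set stands is to evaluate the troublesome factors on the log scale, where they do not overflow. The sketch below uses the first parameter set posted above. The variable names (logbig, log_tb, log_ts) are illustrative, and the code covers only the two power terms, not the full equation, which is not reproduced in this thread.

```sas
data _null_;
  /* first parameter set posted earlier in the thread */
  u = 344.17; eb = 470.89; es = 388.29; b = 1415; s = 104;
  logbig = log(constant('BIG'));   /* largest x for which exp(x) is representable */
  log_tb = b * log(1 + u/eb);      /* log of (1+u/eb)**b */
  log_ts = s * log(1 + u/es);      /* log of (1+u/es)**s */
  put logbig= log_tb= log_ts=;
  if log_tb > logbig then put '(1+u/eb)**b would overflow, but its log is an ordinary number.';
  if log_ts > logbig then put '(1+u/es)**s would overflow, but its log is an ordinary number.';
run;
```

For these values, the log of (1+u/eb)**b exceeds the log of CONSTANT('BIG'), so the factor itself cannot be stored, while its logarithm poses no difficulty.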

Rick_SAS · Posted 09-04-2018 08:26 PM

Perhaps, but I would appreciate your response to the questions in the first paragraph of my previous message. Avner Friedman's fundamental rule of applied mathematics is "never attempt to solve horribly complicated equations if you don't know where they came from." The complexity of the equations is often an indication that there is a better approach to formulating or solving the problem. References would be nice, too. Is this equation something that you developed or did it come from a paper or textbook that we can access?

trungcva112 · Posted 09-05-2018 10:57 AM

Hi @Rick_SAS. The parameters are estimated using the maximum likelihood method (EKOP 1996, page 11). I am confident that I estimated these parameters correctly.

These parameters are then plugged into the equation, which I obtained from a recent paper in a very high-quality journal (page 28 of the document linked below):

http://www.acsu.buffalo.edu/~swhuh/y_daily_prob_Appendix.pdf

The authors mention how to overcome the overflow issue. I followed their approach, but there are still many cases with large numbers that cannot be computed.

So do you know whether there is any method to overcome this (maybe other software or another machine)?

ChrisNZ · Posted 09-04-2018 10:44 PM

> I would want to know which is the largest (integer) y in (1+x)^y that SAS can compute?

You can do this:

data _null_;
  do X=1 to 10;
    Y=int( log(constant('big')) / log(X+1) );
    put X= Y= ;
  end;
run;

X=1 Y=1024
X=2 Y=646
X=3 Y=512
X=4 Y=441
X=5 Y=396
X=6 Y=364
X=7 Y=341
X=8 Y=323
X=9 Y=308
X=10 Y=296

High-Performance SAS Coding - Third Edition

ChrisNZ · Posted 09-04-2018 10:49 PM

Actually, you can't use the numbers in my previous reply. This yields useful numbers:

data _null_;
  do X=1 to 10;
    Y=int( log(2**1023.999999999) / log(X+1) );
    put X= Y= ;
  end;
run;

X=1 Y=1023
X=2 Y=646
X=3 Y=511
X=4 Y=441
X=5 Y=396
X=6 Y=364
X=7 Y=341
X=8 Y=323
X=9 Y=308
X=10 Y=296


trungcva112 · Posted 09-06-2018 09:24 AM

Thank you. But does anyone know how to compute these large numbers or get around this issue?

Rick_SAS · Posted 09-06-2018 10:30 AM

I glanced at the references that you provided. It seems that the paper maximizes the likelihood function (Eqn 16, p 1414) instead of the log-likelihood. Switch to the log-likelihood and the overflow will probably disappear.

If you don't take this advice, then other options include:

1. You can do extended precision computations in languages such as Mathematica.

2. To get SAS to compute the answer, you would need to rewrite the expression. In the appendix of the paper that you linked to, the authors do that and reduce the number of times that overflows occur (see p. 4, Eqn 17, and the discussion therein).

3. Provided that e1 = mu/eps_B and e2 = mu/eps_S are less than 1, you can estimate those terms by using the binomial expansion of the expressions (1 + e1)^B and (1 + e2)^S. However, the number of terms that you choose to keep depends on the smallness of e1 and e2, which are data dependent, so this idea might not work for your data.

4. Sometimes the parameters become more manageable if you center or standardize the data. For example, if you measure volume in units of "hundred million shares" or share prices in units of "ten dollars," that could make a difference in the magnitude of the parameter estimates.

Good luck!
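
A general way to rewrite an expression of this kind is to keep every term as a logarithm and factor out the largest one before exponentiating (often called the log-sum-exp trick). The sketch below assumes a hypothetical ratio of the form T1 / (T1 + T2 + T3); it is not the paper's equation, and the three log values are made up for illustration. Each term by itself would overflow if exponentiated directly.

```sas
data _null_;
  /* hypothetical log-scale terms; NOT taken from the paper */
  array lt[3] lt1-lt3 (776.3 770.1 700.0);
  m = max(of lt[*]);                  /* largest log term */
  denom = 0;
  do i = 1 to dim(lt);
    denom = denom + exp(lt[i] - m);   /* every exponent is <= 0, so no overflow */
  end;
  ratio   = exp(lt[1] - m) / denom;   /* T1 / (T1 + T2 + T3) */
  log_sum = m + log(denom);           /* log(T1 + T2 + T3) */
  put ratio= log_sum=;
run;
```

Because the common factor exp(m) cancels in the ratio, the result is an ordinary number between 0 and 1 even though none of the individual terms can be stored. Whether the actual equation can be arranged into this form depends on its structure.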
