BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
stephenpayne
Calcite | Level 5

The following code shows my problem. The betainv() function will not work for a certain range of probabilities. If I run the code through the section below titled "/*works*/", the code runs fine. But if I change the probability from 0.58 to 0.59, it does not work. Excel does the calculation just fine (see pic attached below). And I can't think of any mathematical reason why SAS would be failing.

 

proc iml;

mean  = 0.975;
sdev  = 0.10;
alpha = (mean**2)*(1-mean)/(sdev**2)-mean;
beta  = (alpha)*(1-mean)/mean;

/*works*/
output = betainv(0.58,alpha,beta);

/*does not work*/
output = betainv(0.57,alpha,beta);

quit;

The error i receive is that there's an invalid argument to the function. 

 

Like i said, Excel does it just fine. For a range of probabilities 0.001 to 0.999, this is what i'm expecting SAS to give me.

 

betainv.JPG

1 ACCEPTED SOLUTION

Accepted Solutions
Rick_SAS
SAS Super FREQ

I'm sorry you are having this problem. I am not sure what is going wrong so I have passed it on to the developer.

 

For the (alpha, beta) parameters in your problem, the beta PDF is highly singular at  x=1. In particular, the PDF at x=1 is undefined.

 The CDF is very, very, steep as x --> 1.  Here is the graph:

 

proc iml;
mean  = 0.975;  sdev  = 0.10;
alpha = (mean**2)*(1-mean)/(sdev**2)-mean;
beta  = (alpha)*(1-mean)/mean;
print alpha beta;

x = do(0, 1, 0.0001);
p = cdf("beta", x, alpha, beta);
title "Graph of CDF Function";
call series(x, p);

I suspect the reason that you can't compute the quantile for a large number such as 0.7 or 0.9 is that the x value for those probabilities is too close to 1. The root-finding algorithm to find the quantile is failing for these extreme parameter values. Let's use the CDF function to compute the probabilities for which the quantile is of the form (1 - epsilon) for epsilon = 10^{-k}, k = 10, 11, ..., 16:

 

g = 10##(-10 : -16);
p = cdf("beta", 1 - g, alpha, beta);
lbls = putn(g, "BEST5.");
print p[L="1 - 10^{-k}" F=8.6 c=lbls];

In standard double-precision arithmetic, the smallest number that you can add/subtract from unity is about 2.2E-16 (this is called "machine epsilon"). You can see that after about p=0.7, the quantile function will not be able to compute x because it will be too close to 1.  In practice, this problem occurs for even smaller values of p because finding the quantile is a root-finding problem which typically involves trying to bracket the root (and thus work very close to 1).

 

Presumably, you want to find a way to work around this issue.

 

1. If you want to graph the quantile function, just reverse the horizontal and vertical axes:

 

title "Graph of Quantile Function";
call series(p, x);

 

 2. If you need to compute quantiles for other reasons, you can exploit a symmetry in the Beta distribution which enables you to always evalute x values that are near 0 instead of x values near 1. The symmetry is 

cdf("beta", x, alpha, beta) = 1 - cdf("beta", 1-x, beta, alpha) 

In terms of the quantile function, the x that satisfies x = QUANTILE("beta", x, alpha, beta) also satisfies

x = 1 - QUANTILE("beta", 1-p, beta, alpha)

In this way you can always evaluate values of x <= 0.5. For example, if you are iterating over a sequence of probability values, you can branch according to whether the probability is less than or greater than 0.5, like this:

p = do(0.01, 0.99, 0.01);
x = j(1, ncol(p), .);
do i = 1 to ncol(p);
   if p[i] <= 0.5 then 
      x[i] = quantile("beta", p[i], alpha, beta);
   else 
      x[i] = 1 - quantile("beta", 1-p[i], beta, alpha);
end;
call series(p, x);

View solution in original post

4 REPLIES 4
ballardw
Super User

You might try using the QUANTILE function instead

quantile('beta',0.57,alpha,beta);

 

There might be an internal precision issue in the betainv function algorithm.

stephenpayne
Calcite | Level 5

Thanks for the quick reply. Sadly, the QUANTILE function does not work for certain probabilities either. For example:

 

proc iml;

mean  = 0.975;
sdev  = 0.10;
alpha = (mean**2)*(1-mean)/(sdev**2)-mean;
beta  = (alpha)*(1-mean)/mean;

invalid = quantile('beta',0.652,alpha,beta);

quit;

/*I found that 0.652 did not work by including this code and checking the error message:*/

output = j(1000,1,.);
do i = 1 to nrow(output) - 1;
	output[i] = quantile('beta',i/nrow(output),alpha,beta);
end;
print output;

quit;

Do you think QUANTILE is suffering from the same internal precision issue? If so, any other thoughts on an alternative method? I need a reliable function because the probability parameter will always be uniform random. I do not control it. 

 

I would be okay with a function equivalent to Excel's =iferror(calc,value if error), but SAS does not have one to my knowledge. The SAS function coalesce() returns the first non-missing value, but in this case it does not work, because it resolves as error and not missing.

 

Thanks again for your help.

ballardw
Super User

Quantile worked for your previous example but not this one so I think there may be an issue, possibly related to the precision in BETA. At least I can show not large changes in the value of beta yield invalid or valid:

I have the first digits of alpha in beta for reference in the code. Then loop beta through a small range around that:

data example;
   mean  = 0.975;
   sdev  = 0.10;
   alpha = (mean**2)*(1-mean)/(sdev**2)-mean;
   beta  = (alpha)*(1-mean)/mean;
   /*alpha=1.4015625 beta=0.0359375*/
   do b = 0.03593 to 0.0360 by .000001;
      invalid = quantile('beta',0.652,alpha,b);
      put b= f7.6 +1 invalid= f17.16;
   end;
run;

I am going to go out on a limb and guess that an exponential somewhere runs out of significant digits.

 

Rick_SAS
SAS Super FREQ

I'm sorry you are having this problem. I am not sure what is going wrong so I have passed it on to the developer.

 

For the (alpha, beta) parameters in your problem, the beta PDF is highly singular at  x=1. In particular, the PDF at x=1 is undefined.

 The CDF is very, very, steep as x --> 1.  Here is the graph:

 

proc iml;
mean  = 0.975;  sdev  = 0.10;
alpha = (mean**2)*(1-mean)/(sdev**2)-mean;
beta  = (alpha)*(1-mean)/mean;
print alpha beta;

x = do(0, 1, 0.0001);
p = cdf("beta", x, alpha, beta);
title "Graph of CDF Function";
call series(x, p);

I suspect the reason that you can't compute the quantile for a large number such as 0.7 or 0.9 is that the x value for those probabilities is too close to 1. The root-finding algorithm to find the quantile is failing for these extreme parameter values. Let's use the CDF function to compute the probabilities for which the quantile is of the form (1 - epsilon) for epsilon = 10^{-k}, k = 10, 11, ..., 16:

 

g = 10##(-10 : -16);
p = cdf("beta", 1 - g, alpha, beta);
lbls = putn(g, "BEST5.");
print p[L="1 - 10^{-k}" F=8.6 c=lbls];

In standard double-precision arithmetic, the smallest number that you can add/subtract from unity is about 2.2E-16 (this is called "machine epsilon"). You can see that after about p=0.7, the quantile function will not be able to compute x because it will be too close to 1.  In practice, this problem occurs for even smaller values of p because finding the quantile is a root-finding problem which typically involves trying to bracket the root (and thus work very close to 1).

 

Presumably, you want to find a way to work around this issue.

 

1. If you want to graph the quantile function, just reverse the horizontal and vertical axes:

 

title "Graph of Quantile Function";
call series(p, x);

 

 2. If you need to compute quantiles for other reasons, you can exploit a symmetry in the Beta distribution which enables you to always evalute x values that are near 0 instead of x values near 1. The symmetry is 

cdf("beta", x, alpha, beta) = 1 - cdf("beta", 1-x, beta, alpha) 

In terms of the quantile function, the x that satisfies x = QUANTILE("beta", x, alpha, beta) also satisfies

x = 1 - QUANTILE("beta", 1-p, beta, alpha)

In this way you can always evaluate values of x <= 0.5. For example, if you are iterating over a sequence of probability values, you can branch according to whether the probability is less than or greater than 0.5, like this:

p = do(0.01, 0.99, 0.01);
x = j(1, ncol(p), .);
do i = 1 to ncol(p);
   if p[i] <= 0.5 then 
      x[i] = quantile("beta", p[i], alpha, beta);
   else 
      x[i] = 1 - quantile("beta", 1-p[i], beta, alpha);
end;
call series(p, x);

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

Multiple Linear Regression in SAS

Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.

Find more tutorials on the SAS Users YouTube channel.

From The DO Loop
Want more? Visit our blog for more articles like these.
Discussion stats
  • 4 replies
  • 1517 views
  • 2 likes
  • 3 in conversation