Programming the statistical procedures from SAS

Beta Regression - NLMIXED

Occasional Contributor
Posts: 7

Beta Regression - NLMIXED

Hi all-

I'd really appreciate your thoughts...

As a reminder, my response variable (y) takes on integers ranging from 1 to

30 (including both 1 and 30). I have one binary predictor (coded 0/1).

The goal is to estimate mean(y) at x=0, mean(y) at x=1, and the ratio

of those two estimates, along with 95% confidence limits.

Here are the steps I took based on very helpful feedback and online searching:

(1) Transformed my data from 1 to 30 to 0 to 1 by employing the

following formula:


(2) Fudged the data very slightly such that the minimum value was

~0.0005 and the maximum value was ~0.999

(3) Employed the following NLMIXED code to fit the Beta Regression:


proc nlmixed data = mydata tech = trureg;

parms beta_xb_0 0 beta_xb_1 0 beta_wd_0 0 beta_wd_1 0;

Xb = beta_xb_0 + beta_xb_1*x;

Wd = beta_wd_0 + beta_wd_1*x;

mu = exp(Xb)/(1 + exp(Xb));

phi = exp(-Wd);

w = mu*phi;

t = (1-mu)*phi;

ll =  lgamma(w+t) - lgamma(w) - lgamma(t) + (w-1)*log(y_transformed) +


estimate 'mean at x=1 on transformed scale'

    exp(beta_xb_0 + beta_xb_1) / (1 + exp(beta_xb_0 + beta_xb_1));

estimate 'mean at x=0 on transformed scale'

    exp(beta_xb_0) / (1 + exp(beta_xb_0));

estimate 'mean at x=1 on original scale'

    29*(exp(beta_xb_0 + beta_xb_1) / (1 + exp(beta_xb_0 + beta_xb_1))) + 1;

estimate 'mean at x=0 on original scale'

    29*(exp(beta_xb_0) / (1 + exp(beta_xb_0))) + 1;

model y_transformed ~ general(ll);



(1) Why are the means on the original scale derived from the ESTIMATE

statements quite different from the hand calculated means on the

original data?

(2) Since the link is seemingly logit, I presume that to obtain

appropriate 95% confidence limits, one should estimate the logit at

x=0 and the logit at x=1 and then apply the inverse logit function to

the estimate as well as upper and lower limits. Then, of course, one

would need to back transform the values (based on "y_transformed") to

the original values ("y")

(3) To obtain 95% confidence limits about the ratio of the two mean

estimates, I also assume that one should estimate:

log((mean|x=0) / (mean|x=1))

and then exponentiate the estimates as well as lower and upper limits,

as one would do to obtain confidence limits about a "relative risk."

Again, one would then need to transform these values back to the

original scale.

Is there something key that I'm missing? I'd really appreciate any

other thoughts or comments on this approach.



Ask a Question
Discussion stats
  • 0 replies
  • 1 in conversation