BookmarkSubscribeRSS Feed
deleted_user
Not applicable
Hello, everyone -

I am trying to learn my way around PROC GLIMMIX specifically for the purposes of beta maximum-likelihood estimation. However, I have had some trouble refining my results, and I would be grateful if anyone on this board could help me out.

Beta Regression in PROC GLIMMIX, as outlined in SAS's official procedure documentation here, is based on the seminal paper by Ferrari and Cribari-Neto (which can be accessed here). Refer to page 114 of the SAS document.

In order to establish a rigorous foundation for application of the function, I am trying to reproduce Ferrari's and Cribari-Neto's results with the gasoline dataset (refer to page 12 of the authors' paper). Their results are as follows:



proc glimmix data=db.prater order=formatted exphessian;

class x3;

model y = x3 x4 / dist=beta solution link=logit s;

random _residual_;

nloptions technique=quanew update=bfgs;

run;




I seem to have gotten similar parameter estimates as the authors, with very different standard errors. Also, the authors' scale parameter (phi) is VERY different (=440.27838, whereas the "scale" value given by GLIMMIX is equal to 164.66).



Would anyone familiar with the implementation of beta regression in GLIMMIX and the Cribari/Ferrari framework be able to advise on whether the "scale" parameter is indeed equal to the authors' "phi"? I understand that this discrepancy might be attributable to other factors, but the model is supposed to be simple and the dataset small (32 observations). Therefore, I would expect that any discrepancies between my results and theirs shouldn't be that significant, especially considering that SAS asserts that PROC GLIMMIX's beta regression functionality is derived from the authors' methodology. I would be happy to share my dataset and SAS output with anyone that might be curious to help me with this.



Thanks for taking the time to read my question!
2 REPLIES 2
MattF
Calcite | Level 5
Hi:

Below are some of my code musing around Beta regression with Proc GLIMMIX and NLMIXED. (Pls note I am passing along assistance getting started with this topic I got some time ago from the always helpful folk @ SAS tech support.)

Best regards,
-MattF

/* prater.sas */

/* Matt Flynn */
/* adapted from: */
/*-----------------------------------------------------------*/
/* Kathleen Kiernan */
/* Technical Support Statistician */
/*---Prater's gasoline data. The data appear as data set -*/
/*---number 135 in Hand, et al. (1994), "A Handbook of --*/
/*---Small Data Sets", Chapman & Hall, New York. ---*/
/*--- ---*/
/*---The data are here arranged as in the paper by ---*/
/*---Ferrari and Cribari-Neto (2004) to match the order --*/
/*---of the parameter estimates in their Table 1. ---*/
/*---The id variable groups the covariate patterns ---*/
/*---with respect to x1-x3. ---*/
/*--- ---*/
/*---The response is the gasoline yield percentage in ---*/
/*---crude oil distillation. X1-X3 are properties of ---*/
/*---the distillation. ---*/
/*---The percentage is converted to a proportion in ---*/
/*---the data step. ---*/
/*--- ---*/
/*---Reference: ---*/
/*---Ferrari, S.L.P. and Cribari-Neto, F. (2004), ---*/
/*--"Beta Regression for Modeling Rates and Proportions", */
/*---Journal of Applied Statistics, 31:7:799--815 ---*/
/*-Prater, N.H., 1956. Estimate gasoline yields from crude. -*/
/*-Petroleum Refiner, 35, 236-238 ---*/
/*-----------------------------------------------------------*/
ods pdf file='~/prater.pdf' style=journal;

data gas;
input id x1 x2 x3 x4 yieldpct;
x1a=(x1>=40);
prop = yieldpct/100;
prop2 = log(prop/(1 - prop));
cards;
1 50.8 8.6 190 205 12.2
1 50.8 8.6 190 275 22.3
1 50.8 8.6 190 345 34.7
1 50.8 8.6 190 407 45.7
2 40.8 3.5 210 218 8.0
2 40.8 3.5 210 273 13.1
2 40.8 3.5 210 347 26.6
3 40.0 6.1 217 212 7.4
3 40.0 6.1 217 272 18.2
3 40.0 6.1 217 340 30.4
4 38.4 6.1 220 235 6.9
4 38.4 6.1 220 300 15.2
4 38.4 6.1 220 365 26.0
4 38.4 6.1 220 410 33.6
5 40.3 4.8 231 307 14.4
5 40.3 4.8 231 367 26.8
5 40.3 4.8 231 395 34.9
6 32.2 5.2 236 267 10.0
6 32.2 5.2 236 360 24.8
6 32.2 5.2 236 402 31.7
7 41.3 1.8 267 235 2.8
7 41.3 1.8 267 275 6.4
7 41.3 1.8 267 358 16.1
7 41.3 1.8 267 416 27.8
8 38.1 1.2 274 285 5.0
8 38.1 1.2 274 365 17.6
8 38.1 1.2 274 444 32.1
9 32.2 2.4 284 351 14.0
9 32.2 2.4 284 424 23.2
10 31.8 0.2 316 365 8.5
10 31.8 0.2 316 379 14.7
10 31.8 0.2 316 428 18.0
;
run;

proc print data=gas;
run;

proc means data=gas n min q1 mean median q3 max;
run;

proc means data=gas n mean var;
class x1;
var prop;
run;

/* get some stats, alpha, beta are method of moments estimates,
(:log_alpha, :log_beta can to be used for starting values for MLE )*/
proc sql print;
select min(prop) as min,
mean(prop) as mean,
var(prop) as var,
max(prop) as max,
min(prop) - 0.001*(calculated max - calculated min) as lower,
max(prop) + 0.001*(calculated max - calculated min) as upper
into :min, :mean, :var, :max, :lower, :upper
from gas;
select (mean(prop))*((mean(prop))*(1 - mean(prop))/ (var(prop)) - 1) as alpha,
(1 - mean(prop))*((mean(prop))*(1 - mean(prop))/ (var(prop)) - 1) as beta,
log(calculated alpha) as log_alpha,
log(calculated beta) as log_beta
into :alpha, :beta, :log_alpha, :log_beta
from gas;
quit;
%put *** mean=&mean.;

%let xvar=X1;
proc sort data=gas;
by &xvar.;
run;

/* Levene's test for Homogeneity of variance */
/* Proc GLIMMIX only models mean, mu of beta, check for does need sub-model for variance, */
/* in this case, no, not needed. */

ods graphics on;
proc reg data=gas;
model prop = &xvar.;
model prop2 = &xvar.;
output out=out rstudent=rstudent predicted=predicted;
run;
quit;
ods graphics off;

goptions reset=all device=png;
axis1 label=(angle=90 'Studentized Residual');
symbol1 color=blue value=circle width=2;
proc gplot data=out;
plot rstudent*predicted=1 / vaxis=axis1 vref=0 lvref=2;
run;
quit;

ods select HOVFTest;
ods output HOVFTest=hovftest;
proc anova data=gas;
class &xvar.;
model prop = &xvar.;
means &xvar. / hovtest=Levene(type=square); * also (type=abs);
run;
quit;

symbol1 font=marker value=U color=salmon;
proc boxplot data=gas;
plot prop*&xvar. /
boxstyle = schematic
boxconnect = mean
cboxes = dagr
cboxfill = ywh
cframe = ligr
idsymbol = star
lvref = 3
nohlabel
vref = &mean.
symbollegend = legend1
totpanels = 1
vaxis = 0 to 0.5 by 0.1;
legend1
;
title font=simplex 'Prater Data';
title "Box-Plots of Prop by &xvar.";
run;
title; title2;

ods output Moments=moments;
ods output ParameterEstimates=parms_ab;
ods output FitQuantiles=fitq;
proc univariate data=gas;
var prop ;
histogram /
beta(theta=0.0000001 scale=1.0000001 color=blue w=2)
/* beta(theta=0.002501 scale=0.535364 color=blue w=2)*/
kernel(c=0.5 lower=0.0)
cfill = ywh
cframe = ligr
/* endpoints = &lower. to &upper. by 0.06 */
/* endpoints = 0 to 0.5 by 0.075 */
endpoints = 0 to 1 by 0.1
;
inset Beta(alpha beta theta sigma) / position=ne;
qqplot / beta(alpha=1.87 beta=3.2755 theta=.002501 sigma=.535364);
title 'Prater Gas data';
run;
title;

proc print data=moments;
run;
proc print data=parms_ab;
run;
data parms_ab(keep=Parameter Estimate);
set parms_ab;
if Symbol in ('Alpha','Beta') then Parameter=Symbol;
if Parameter in ('Alpha','Beta','Mean');
run;

proc print data=parms_ab;
run;

*---------------------------------------------------;
/* KDE - kernel density estimation */
*---------------------------------------------------;
ods graphics on;
proc kde data=gas bwm=.45;
univar prop / plots=histdensity levels out=out;
title2 'Smooth Estimate using larger Bandwidth';
run;
ods graphics off;
title2;

/* Beta QQ Plot from Proc Univariate */
proc print data=fitq;
run;
data fitq2;
set fitq(keep=ObsQuantile rename=ObsQuantile=Quantile);
dist='Observed';
n=_N_;
output;
set fitq(keep=EstQuantile rename=EstQuantile=Quantile);
dist='Beta';
n=_N_;
output;
run;

proc sort data=fitq2;
by descending dist;
run;

proc print data=fitq2;
run;

symbol1 l=1 value=square color=blue font=;
symbol3 l=2 value=circle color=green font=;
proc gplot data=fitq2;
plot n * Quantile = dist;
run;
quit;

*---------------------------------------------------;
/* KS - Kolmogorov-Smirnov - GOF test */
*---------------------------------------------------;
ods output KolSmir2Stats=k;
proc npar1way data=fitq2 wilcoxon median edf;
class dist;
var Quantile;
run;
proc print data=k;
run;
title;

%put log_alpha=&log_alpha., log_beta=&log_beta.;

/***************************************************************************/
/* fit beta regression model via GLIMMIX */
/***************************************************************************/
ods graphics on;
*ods select LSMeans DiffPlot;
ods output ParameterEstimates=parms;
proc glimmix data=gas or plots=all;*noprofile plots=pearsonpanel;
* model prop = / dist=beta s;
model prop = x1 x2 x3 x4 / dist=beta solution;
*output out=out predicted(ilink)=PredMu residual(ilink)=ResidMu;
output out=out / allstats;
*random _residual_;
run;
ods graphics off;

data parms;
set parms;
if Effect='Intercept' then mu = 1/(1 + exp(-estimate));
if substr(Effect,1,1)='x' then odds_ratio=exp(estimate);
run;

proc print data=parms;
format Estimate 10.6;
run;

/*proc print data=out;*/
run;
data parms;
set parms(keep=Effect Estimate);
if Effect='Intercept' then do;
mu = 1/(1 + exp(-estimate));
Estimate1=Estimate;
retain Estimate1 mu;
end;
if effect='Scale' then do;
oneoverphi=1/estimate;
alpha=mu*estimate;
beta=(1 - mu)*estimate;
Estimate2=Estimate;
end;
rename Estimate=Scale;
run;

proc print data=parms(firstobs=2);
var Estimate1 Estimate2 mu Scale alpha beta one:;
run;

/***********************************************************************************************/
/* fit beta model via NLMIXED */
/* http://mathworld.wolfram.com/BetaDistribution.html */
/* http://functions.wolfram.com/GammaBetaErf/ */

/* using starting values supplied from Proc UNIVARIATE; alpha, beta parameterization
f(y) = (gamma(alpha + beta)/((gamma(alpha)*gamma(beta)))*
((y)**(alpha - 1))*((1 - y)**(beta - 1))) alpha>0, beta>0 for 0<1, Both
alpha and beta are shape parameters, alpha pulling density to zero, beta pulling towards 1 */
/* Smithson, Michael & Jay Verkuilen */
/* Beta Regression: Practical Issues in Estimation, wp, Jan-05 */
/* http://www.anu.edu.au/psychology/people/smithson/details/betareg/Readme.pdf */
/* recommends ...tech=quanew update=bfgs; with tech=truereg for hard problems */
/***********************************************************************************************/
proc nlmixed data=gas;* tech=quanew update=bfgs;
* parms / data=parms_ab(where=(Parameter in ('Alpha','Beta'))); * using starting values estimated above from Proc UNIVARIATE;
parms ba_0=&log_alpha. bb_0=&log_beta.; * using starting values estimated above from Proc SQL;
alpha = exp(ba_0);
beta = exp(bb_0);
y = prop;
* beta=1; * power-function distribution;
if alpha > 1 and beta > 1 then mode = (alpha - 1)/(alpha + beta - 2);
else if alpha < 1 and beta < 1 then mode = '0 and 1';
else if alpha < 1 and beta >= 1 then mode = 0;
else if alpha = 1 and beta > 1 then mode = 0;
else if alpha >= 1 and beta < 1 then mode = 1;
else if alpha > 1 and beta = 1 then mode = 1;
else if alpha = beta = 1 then mode = 'does not exist';
/* four ways - el mismo - all same */
prob = (gamma(alpha + beta)/((gamma(alpha)*gamma(beta)))*
((y)**(alpha - 1))*((1 - y)**(beta - 1)));
loglike = log(prob);

* loglike = lgamma(alpha + beta) - (lgamma(alpha) + lgamma(beta)) +
(alpha - 1)*log(y) + (beta - 1)*log(1 - y);

* loglike = log(pdf('BETA', y, alpha, beta));

* loglike = logpdf('BETA', y, alpha, beta);

model y ~ general(loglike);
estimate 'alpha' exp(ba_0);
estimate 'beta' exp(bb_0);
estimate 'E(prop)' alpha/(alpha + beta);
estimate 'Var(prop)' alpha*beta/(((alpha + beta)**2)*(alpha + beta + 1));
estimate 'Var(prop)' alpha/(alpha + beta)*(1 - (alpha/(alpha + beta)))*(1/(alpha + beta + 1));
estimate 'Var(prop)' alpha*beta/((alpha + beta + 1)*(alpha + beta)**2);
* estimate 'Var(prop)' (1/((alpha + beta + 1))*mu*(1 - mu));
estimate 'Var(prop)' (1/((alpha + beta + 1))*(alpha/(alpha + beta))*(1-(alpha/(alpha + beta))));
estimate 'mode' mode;
estimate 'precision phi' 1/(alpha + beta + 1);
estimate 'dispersion 1/phi' (alpha + beta + 1);

title 'Beta model via Proc NLMIXED - Alpha,Beta Parameterization';
run;
title;

/******************************************************************************/
/* from betareg package in R */
/* Alexandre de Bustamante Simas */
/* Version 1.1, June 27, 2005 */
/* Version 2.23, Apr 4, 2010 */
/* Achim Zeileis, Francisco Cribari-Neto */
/* http://cran.r-project.org/ */
/*
(Intercept) V1 V2 V3 V4 phi
-2.694897 0.004542 0.030415 -0.011044 0.010564 246.510503
*******************************************************************************/

/******************************************************************************************/
/* fit beta model via NLMIXED (mu, phi) parameterization */
/* Paolino, Phillip;
Maximum likelihood estimation of models with Beta-distributed dependent variables;
Political Analysis, 2001, v9, n4, p325-346,
http://polmeth.wustl.edu/polanalysis/vol9no4.html */
/******************************************************************************************/
proc print data=gas(obs=5); run;
ods output ParameterEstimates=parms;
proc nlmixed data=gas;* cov ecov;
* bounds 0 < mu < 1, phi > 0;
mu = exp(eta)/(1 + exp(eta)); * logit inverse link function; * logit link: log(mu/(1 - mu);
* mu = 1/2 + atan(eta)/constant('pi'); * Cauchy cauchit link tan[constant('pi')*(mu – .5)];
* mu = 1 - exp(-exp(1-eta)); * loglog link;
* mu = (1 - exp(-exp(eta))); * cloglog link: log(-log(1 - mu));
* mu = probnorm(eta); * probit link: probit(mu);
y = prop;
loglike = lgamma(phi) - (lgamma(mu*phi) + lgamma((1 - mu)*phi)) +
(mu*phi - 1)*log(y) + ((1 - mu)*phi - 1)*log(1 - y);
model y ~ general(loglike);
estimate 'mu' mu;
estimate 'precision phi' phi;
estimate 'dispersion 1/phi' 1/phi;
estimate 'var' mu*(1 - mu)/(1 + phi);
estimate 'alpha' mu*phi;
estimate 'beta' (1 - mu)*phi;
/* estimate 'var' alpha*beta/(((alpha + beta)**2)*(alpha + beta + 1));*/
estimate 'Var' (mu*phi)*((1 - mu)*phi)/((((mu*phi) + ((1 - mu)*phi))**2)*((mu*phi) + ((1 - mu)*phi) + 1));
estimate 'Var' (1/((mu*phi + (1 - mu)*phi + 1))*mu*(1 - mu));
estimate 'mode' ((mu*phi)-1)/((mu*phi)+ ((1 - mu)*phi) - 2);
estimate 'p = mu*phi' mu*phi; * mu = p/(p + q);
estimate 'q = phi - mu*phi' phi - mu*phi; * phi = p + q;
title 'Beta model via Proc NLMIXED (mu, phi) parameterization';
run;
title;

proc nlmixed data=gas;* cov ecov;
parms b0=-3 b1=0 b2=0. b3=0 b4=0 phi=246.51;
eta = b0 + b1*x1 + b2*x2 + b3*x3 + b4*x4;
mu = exp(eta)/(1 + exp(eta)); * logit inverse link function;
* eta_phi = p0;* + p2*x2; * can be gerneralized with parameters in the variance;
* see: Paolino (1999), also Smithson & Verkuilen (2005);
* phi = exp(eta_phi); * log inverse link function;
* eta = b0;
* mu = exp(eta); * log inverse link function;
y = prop;
* loglike = lgamma(phi) - (lgamma(mu*phi) + lgamma((1 - mu)*phi)) +
(mu*phi - 1)*log(y) + ((1 - mu)*phi - 1)*log(1 - y);

/* Smithson & Verkuilen (2005), "A Better Lemon Squeezer"*/
/* http://www.anu.edu.au/psychology/people/smithson/details/betareg/SAS_beta_regression.sas */
/* keep original alpha, beta formulation form of Beta density, tranlate mu, phi -> p & q */
p = mu*phi; * mu = p/(p + q);
q = phi - mu*phi; * phi = p + q;
loglike = lgamma(p + q) - lgamma(p) - lgamma(q) + (p - 1)*log(y) + (q - 1)*log(1 - y);

model y ~ general(loglike);
estimate 'mu' mu;
estimate '1/phi' 1/phi;
estimate 'var' mu*(1 - mu)/(1 + phi);
estimate 'alpha' mu*phi;
estimate 'beta' (1 - mu)*phi;
/* estimate 'var' alpha*beta/(((alpha + beta)**2)*(alpha + beta + 1));*/
estimate 'Var' (mu*phi)*((1 - mu)*phi)/((((mu*phi) + ((1 - mu)*phi))**2)*((mu*phi) + ((1 - mu)*phi) + 1));
estimate 'Var' (1/((mu*phi + (1 - mu)*phi + 1))*mu*(1 - mu));
estimate 'mode' ((mu*phi)-1)/((mu*phi)+ ((1 - mu)*phi) - 2);
estimate 'p' p;
estimate 'q' q;
/* estimate 'odds ratio - x1' exp(b1); */
/* estimate 'odds ratio - x2' exp(b2); */
/* estimate 'odds ratio - x3' exp(b3); */
/* estimate 'odds ratio - x4' exp(b4); */
title 'Beta model via Proc NLMIXED (mu, phi) parameterization';
run;
title;

proc print data=parms double;
format estimate 12.6;
run;

data parms_ab;
set parms(keep=Parameter Estimate);
if Parameter='b0' then do;
mu = 1/(1 + exp(-Estimate));
retain mu;
end;
if Parameter='phi' then do;
oneoverphi=1/Estimate;
alpha=mu*Estimate;
beta=(1 - mu)*Estimate;
end;
run;

proc print data=parms_ab(firstobs=2);
*&var mu estimate alpha beta one:;
run;

/*
From Proc UNIVARIATE Online docs
The beta distributions are also referred to as Pearson Type I or II distributions.
These include the power-function distribution (\beta=1), the arc-sine distribution
(\alpha =\beta =\frac{1}2), and the generalized arc-sine distributions (\alpha +\beta =1,
\beta \neq \frac{1}2).
*/

ods pdf close;
deleted_user
Not applicable
THANK YOU, Matt! That's exactly what I was looking for.

Turns out that part of the problem was my dataset - I think one or two observations might have been off. Otherwise, the specific code, and the alternatives, work perfectly!

Thanks again - the code you provided matches Cribari-Neto's and Ferrari's results exactly. : )

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 2 replies
  • 9661 views
  • 0 likes
  • 2 in conversation