DNYabs
Obsidian | Level 7

Hello SAS Community!

Here is a daft question: my data follow a beta distribution. When I present results using the "lines" mean-separation method for quantitative data that follow a normal distribution ("d=gaussian link=identity"), I report the residual as a statistic. That is familiar territory for me. For a binomial distribution, the "Pearson Chi-Square / df" value can be used to show dispersion (close to 1 is acceptable). Any suggestions on what I should use from the output to show data dispersion and the residual for a beta distribution? The output I get for the beta distribution shows an intercept and a "scale". The trouble is that I am not sure what to make of that!

 

Also, what issues do I face if I use this model without the link function? I tried that approach and what I got were different estimates from the ones with link=logit.

Thanks,

DNY

 

Data DAL4;
Input Loc$ Cultivar$ TRT Rep Inc Sev Index DON FDK Incx Sevx Indexx DONx FDKx Yield TW TKW;
Datalines;
Volga Samson 1 1 36 19.056 6.860 1.2 6 0.360 0.191 0.069 0.012 0.060 43.36 53.96 34.12
…….;
Proc glimmix data = DAL4 method = quadrature plots=boxplot(random marginal conditional observed);
class Loc Rep Cultivar TRT;
model Incx = TRT /solution d=beta link=logit;
random intercept / subject=Rep(Loc);
lsmeans TRT / cl ilink adjust = Tukey lines;
STORE quad1;
run;

proc plm restore = quad1;
lsmeans TRT / ilink diff adj=Tukey e means lines;
run;

SteveDenham
Jade | Level 19

I hope I am answering the right question. Variance of a beta variable at the mean can be calculated from the mean and scale as mean*(1-mean)/(1+scale). The residual is simply observed value minus predicted value.
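
As a rough worked example of the formula above, the DATA step below plugs in a hypothetical mean of 0.36 (an assumed value chosen for illustration, not a model estimate) together with the scale of 23.5181 quoted later in this thread. The data set and variable names are illustrative only.

```sas
/* Sketch: variance of a beta response from its mean and the GLIMMIX scale */
data beta_var;
   mu    = 0.36;      /* hypothetical mean on the data scale, 0 < mu < 1 */
   scale = 23.5181;   /* scale estimate quoted later in this thread      */
   var   = mu * (1 - mu) / (1 + scale);
   sd    = sqrt(var);
run;

proc print data=beta_var noobs;
run;
```

With these inputs the variance is about 0.0094 and the standard deviation about 0.097.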

 

I am more concerned that you get different results when specifying link=logit as compared to no link statement, as the canonical link for the beta is the logit.  They ought to be the same.  All I can suggest is something I learned here last week - the data must be the "same", and that includes the order of the observations. So check for any PROC SORT statements prior to your GLIMMIX.
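
One way to check this, sketched under the assumption that the data set is DAL4 as in the original post: sort once, fit the model with and without LINK=LOGIT on the identically ordered data, and compare the fixed-effect estimates. The names DAL4_sorted, pe_logit and pe_default are hypothetical, and the BY variables are only a plausible choice of ordering.

```sas
/* Sketch: same data order, with and without an explicit LINK= option */
proc sort data=DAL4 out=DAL4_sorted;   /* DAL4_sorted is a hypothetical name */
   by Loc Rep TRT;
run;

ods output ParameterEstimates=pe_logit;
proc glimmix data=DAL4_sorted method=quadrature;
   class Loc Rep Cultivar TRT;
   model Incx = TRT / solution d=beta link=logit;
   random intercept / subject=Rep(Loc);
run;

ods output ParameterEstimates=pe_default;
proc glimmix data=DAL4_sorted method=quadrature;
   class Loc Rep Cultivar TRT;
   model Incx = TRT / solution d=beta;   /* no LINK=, so the default is used */
   random intercept / subject=Rep(Loc);
run;

/* Report any estimates that differ by more than a small relative tolerance */
proc compare base=pe_logit compare=pe_default criterion=1e-6;
run;
```

If PROC COMPARE reports no unequal values, the two specifications agree and any earlier discrepancy most likely came from the data order.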

 

SteveDenham

DNYabs
Obsidian | Level 7
 

Hi Steve,

Thanks for always coming through! You were right on the money on PROC SORT. I do have the same estimates now. The initial code generated the Fit Statistics below. My understanding is that the scale parameter (23.5181) is an inverse of the variance of Incx, the response variable in this case. If that is correct, the variance would be:

Var = Exp(23.1581)/(1+Exp(23.1581)) = 1? I am sure I missed something here, because this does not seem correct, does it?

 

[Image: Fit Statistics table]

 

If I were to use {Mean*(1-mean)/(1+Scale)}, since there is no mean in the output, I coded mean in the following fashion:

Proc glimmix data = DAL96 method = quadrature plots=all;
class Loc Rep Cultivar TRT;
model Incx = TRT /solution d=beta link=logit;
random intercept / subject=Rep(Loc);
output out=overdisp2 pearson=pearson;
run;
PROC MEANS DATA=overdisp2 mean var;
var pearson;
run;

 

The mean = -0.0197244 and the variance = 0.8623935. Somehow, I am not happy with the negative mean, unless it is supposed to be exp(-0.0197244) = 0.98 and the variance becomes 2.37?

What are your thoughts? Any ideas on better code for this and for predicted estimates?
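
For the predicted values, one possible sketch is to request predictions on the data scale in the OUTPUT statement rather than summarizing the Pearson residuals. The output variable names (mu_cond, mu_marg, raw_resid) and the data set name pred_out are hypothetical; the model is the one posted above.

```sas
/* Sketch: predicted means and raw residuals on the data (0-1) scale */
proc glimmix data=DAL96 method=quadrature;
   class Loc Rep Cultivar TRT;
   model Incx = TRT / solution d=beta link=logit;
   random intercept / subject=Rep(Loc);
   output out=pred_out
          pred(ilink)=mu_cond          /* prediction including random effects */
          pred(noblup ilink)=mu_marg   /* prediction from fixed effects only  */
          resid(ilink)=raw_resid       /* observed minus mu_cond              */
          pearson=pearson;             /* standardized residual, as before    */
run;

proc means data=pred_out mean var min max;
   var Incx mu_cond mu_marg raw_resid pearson;
run;
```

Pearson residuals are already standardized, so a mean near 0 and a variance near 1 are what one would hope to see; they are not on the logit scale and should not be exponentiated.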

Thanks a mill once again,

DNY 

SteveDenham
Jade | Level 19

Try adding this LSMEANS statement:

 

lsmeans trt/ilink;

From this result, you would be able to calculate the variance at each level of trt.  To get an estimate at the overall mean, use an LSMESTIMATE statement.  If you have 4 levels for trt, try this:

 

lsmestimate trt 'Grand mean' 1 1 1 1/divisor=4 ilink;

You can adjust this for N levels by including N 1s and setting divisor=N.  This should avoid the negative estimated mean value.
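
Put together, the two statements might be used as in the sketch below, which assumes TRT has exactly 4 levels and that the scale estimate is typed in by hand from the fitted model's output. The macro variable scale and the data set names lsm and lsm_var are hypothetical, and Mu is assumed to be the name of the inverse-linked column in the LSMeans ODS table (worth confirming with PROC CONTENTS).

```sas
/* Sketch: treatment means on the data scale, then variance per treatment */
ods output LSMeans=lsm;
proc glimmix data=DAL96 method=quadrature;
   class Loc Rep Cultivar TRT;
   model Incx = TRT / solution d=beta link=logit;
   random intercept / subject=Rep(Loc);
   lsmeans TRT / ilink cl;
   /* assumes 4 levels of TRT; use N ones and divisor=N otherwise */
   lsmestimate TRT 'Grand mean' 1 1 1 1 / divisor=4 ilink cl;
run;

%let scale = 23.5181;   /* copied by hand from the model output */

data lsm_var;
   set lsm;
   /* Mu: inverse-linked least squares mean (assumed column name) */
   var_y = Mu * (1 - Mu) / (1 + &scale);
run;

proc print data=lsm_var noobs;
   var TRT Mu var_y;
run;
```

Note that the 'Grand mean' estimate averages the treatments on the logit scale and then applies the inverse link, so it need not equal the simple average of the inverse-linked treatment means.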

 

SteveDenham

 

 

DNYabs
Obsidian | Level 7

Hi Steve,

That worked! Thanks a bunch!

DNY

jiltao
SAS Super FREQ

The scale parameter for different distributions in PROC GLIMMIX can be found in the documentation below --

https://go.documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=v_008&docsetId=statug&docsetTarget=stat...

And the scale parameter is NOT the inverse of the variance for the beta distribution.

The parameterization of the beta distribution, including the variance of Y, is in the documentation below -

https://go.documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=v_008&docsetId=statug&docsetTarget=stat...

Var(y) = u(1-u) / (1+scale), where u is the mean of y
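
For readers more used to the standard two-shape-parameter form of the beta distribution, the variance formula can be recovered as follows. This is a sketch that assumes the mean/scale parameterization maps to the shape parameters as $\alpha = \mu\phi$ and $\beta = (1-\mu)\phi$, with $\phi$ denoting the scale.

```latex
\begin{aligned}
Y &\sim \mathrm{Beta}(\alpha, \beta), \qquad \alpha = \mu\phi, \quad \beta = (1-\mu)\phi, \quad \alpha + \beta = \phi \\
\mathrm{E}[Y] &= \frac{\alpha}{\alpha+\beta} = \frac{\mu\phi}{\phi} = \mu \\
\mathrm{Var}[Y] &= \frac{\alpha\beta}{(\alpha+\beta)^{2}(\alpha+\beta+1)}
               = \frac{\mu\phi\,(1-\mu)\phi}{\phi^{2}(\phi+1)}
               = \frac{\mu(1-\mu)}{1+\phi}
\end{aligned}
```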

DNYabs
Obsidian | Level 7

Thanks for the links JilTao!

I read the link below, which I may have grossly misinterpreted. It said, "The scale parameter (59.4261) is displayed in the Parameter Estimates table and is inversely related to the variance of the response variable." http://support.sas.com/kb/57/480.html

The congruency between you and Steve means my hurdle is out of the way and I thank you both very much!

DNY

jiltao
SAS Super FREQ

Inversely related does not mean it is the inverse. Scale is in the denominator of the variance formula, which is why they are inversely related, but they are not inverses of each other.
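
A small numerical sketch may make the distinction concrete. It holds a hypothetical mean fixed at 0.36 (an assumed value) and tabulates the variance next to 1/scale for a few scale values, including the two scale estimates quoted in this thread. All names are illustrative.

```sas
/* Sketch: variance shrinks as scale grows, but it is not equal to 1/scale */
data scale_vs_var;
   mu = 0.36;                             /* hypothetical mean */
   do scale = 1, 5, 23.5181, 59.4261;
      var_y     = mu * (1 - mu) / (1 + scale);
      inv_scale = 1 / scale;              /* what "the inverse" would give */
      output;
   end;
run;

proc print data=scale_vs_var noobs;
run;
```

Because u(1-u) can never exceed 0.25, the variance is always smaller than 1/scale, while still decreasing as the scale increases.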
