BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
TVR
Calcite | Level 5 TVR
Calcite | Level 5

Dear All,

 

For a logistic mixed-effects model, I am trying to visualize the fitted response on top of the original data points. What confuses me is that the fitted curve seems to run much too low as compared to the (eye-balled) mean of the data points. Could anybody advice me on why this may be happening (or simply point out what I am misunderstanding)?

 

The syntax is very simple (as implemnted in v.9.4):

 

proc glimmix data= data1;

class Site;

model Success/Trials = Explanatory /solution link=logit err=binomial;

random Site;

run;

 

I then extract the intercept (a) and coefficient (b) from solutions and plot the fitted curve at the original (ilink) scale as

y=exp(a+b*Explanatory)/(1+(a+b*Explanatory))

Here, I expect the resultant prediction y to apply to the mean value of the random effect, and hence to – on average – hit the middle of the original data points (expressed as the ratio Success/Trials). But as said, when visually examined, the fitted function seems to run much too low at the ilink scale.

 

I am now wondering whether I am misunderstanding something basic about the default parameterization of the random effect in glimmix? Or is there some other reason why this should not work as outlined above?

 

If I am completely off the mark then how would you create the plot described above, with the idea of showing the original data and the fitted function at the original scale, for immediate and intuitive interpretation by the non-expert viewer?

 

With very many thanks for any advice you may offer,

Tomas

1 ACCEPTED SOLUTION

Accepted Solutions
Rick_SAS
SAS Super FREQ

You can use the OUTPUT statement to compute the predicted values on the inverse link scale.  You can create the predicted values accounting for the random effect, but probably you are asking for the marginal predicted values. 

 

For a binomial response, the observed values are proportions. You can compute that quantity in PROC LMIMIX itself, or in a separate DATA Step. In the following, I compute in in the PROC, then use PROC SGPLOT to overlay a scatter plot of the data and a curve for the predicted values.

 

data data1;
input Site Success Trials Explanatory;
datalines;
 1 20 30 10
 2 10 18  8
 3 15 30  5
 4 20 55  5
 5  5 60  3
11 20 28  9
12 11 16  8
13 14 25  6
14 18 45  4
15  7 54  2
;

proc glimmix data= data1;
class Site;
model Success/Trials = Explanatory /solution link=logit err=binomial;
random Site;
proportion = Success / Trials;   /* create new variable for graphing */
output out=PredOut predicted(noblup ilink)=PredMarg /* marginal mean */
         / symbols;              /* write proportion variable to output data */
run;

proc sort data=PredOut;
by Explanatory;
run;

proc sgplot data=PredOut;
scatter x=Explanatory y=proportion / datalabel=Site; 
series x=Explanatory y=PredMarg;
run;

View solution in original post

4 REPLIES 4
Rick_SAS
SAS Super FREQ

You can use the OUTPUT statement to compute the predicted values on the inverse link scale.  You can create the predicted values accounting for the random effect, but probably you are asking for the marginal predicted values. 

 

For a binomial response, the observed values are proportions. You can compute that quantity in PROC LMIMIX itself, or in a separate DATA Step. In the following, I compute in in the PROC, then use PROC SGPLOT to overlay a scatter plot of the data and a curve for the predicted values.

 

data data1;
input Site Success Trials Explanatory;
datalines;
 1 20 30 10
 2 10 18  8
 3 15 30  5
 4 20 55  5
 5  5 60  3
11 20 28  9
12 11 16  8
13 14 25  6
14 18 45  4
15  7 54  2
;

proc glimmix data= data1;
class Site;
model Success/Trials = Explanatory /solution link=logit err=binomial;
random Site;
proportion = Success / Trials;   /* create new variable for graphing */
output out=PredOut predicted(noblup ilink)=PredMarg /* marginal mean */
         / symbols;              /* write proportion variable to output data */
run;

proc sort data=PredOut;
by Explanatory;
run;

proc sgplot data=PredOut;
scatter x=Explanatory y=proportion / datalabel=Site; 
series x=Explanatory y=PredMarg;
run;
lvm
Rhodochrosite | Level 12 lvm
Rhodochrosite | Level 12

Adding to Rick's valuable response, note that the marginal predicted value corresponds to predicted linear predictor (.e.g, logit) where the random effects are at their theoretical average value (=0). The inverse-link of this predicted value is not the predicted probability averaged over the random effects. This is because the inverse link is a non-linear function.  The inverse-link does correspond to the predicted probability when the random effects take the value of 0. Thus, the predicted p from the inverse-link function is not the mean of the proportions over the sites.

 

Read SAS Global Forum articles by Walt Stroup or his excellent 2015 article in Agronomy Journal (open access) for details on this.

Rick_SAS
SAS Super FREQ

If you want to investigate the alternative that LVM mentions, you can get the predictions that include the random effects with 

predicted(ilink)=Pred

However, you won't get a nice smooth curve in the fit plot.

TVR
Calcite | Level 5 TVR
Calcite | Level 5

Very many thanks for your quick and valuable pointers!

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 3564 views
  • 4 likes
  • 3 in conversation