Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Home
- /
- Analytics
- /
- Stat Procs
- /
- Getting CI from a GLMM

Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

🔒 This topic is **solved** and **locked**.
Need further help from the community? Please
sign in and ask a **new** question.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Posted 10-13-2015 01:36 AM
(2337 views)

Hello all,

I ran this logistic model with random effects using a GLMM, and using the coefficients, I have calculated the predicted probabilities for every value of X, which is a continous variable (well, in my data I only had a few values of X, but I am extrapolating). After calculating the predicted probabilities, I have used sgplot to make a sigmoid plot of X in the x-axis and the probabilities in the y axis.

What I couldn't do, is:

Calculate CI's for the probabilities, instead of one sigmoid, I want to have 3, i.e., I want to add one for the lower CI limit and one for the higher CI limit.

Do you know how I can obtain the CI's for the probabilities I have just calculated ?

Thank you !

```
proc glimmix data=data;
class Id;
model Success (event='1') = X / solution dist=BERNOULLI link=logit;
random int / sub=ID;
output out=pred pred=p pred(ILINK )=pp pred(NOBLUP ILINK )=marmean STDERR(NOBLUP ILINK )=se LCL(NOBLUP ILINK )=lcl UCL(NOBLUP ILINK )=Ucl;
run;
data prediction;
do X = 10 to 50;
mu = -5 + 0.15*X;
phat = exp(mu)/(1+exp(mu));
output; end;
run;
```

1 ACCEPTED SOLUTION

Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

You want to use proc plm after running glimmix. Put the Xs in a data set first, run glimmix using the store= command, then run plm to do the scoring. You get predicted and confidence limits. You either print the results or plot them. If you don't use ilink option in plm, you get the logit predictions and confidence limits.

```
data scoreX;
do X = 10 to 50;
output; end;
run;
proc glimmix data=data ;
class Id;
model Success (event='1') = X / solution dist=BERNOULLI link=logit;
random int / sub=ID;
output out=pred pred=p pred(ILINK )=pp pred(NOBLUP ILINK )=marmean STDERR(NOBLUP ILINK )=se LCL(NOBLUP ILINK )=lcl UCL(NOBLUP ILINK )=Ucl;
```**store results**;
run;
proc plm **restore=results**;
score data=scoreX out=predict predicted lclm uclm / ilink;
run;
proc print data=predict;run;
proc sgplot data=predict;
series x=x y=predicted / lineattrs=(color=red thickness=2);
series x=x y=lclm / lineattrs=(color=blue);
series x=x y=uclm / lineattrs=(color=blue);
run;

9 REPLIES 9

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

I think I am missing something here. Do you want what I would call the "expected" confidence bounds? That is, those around the probabilities generated in the datastep DATA PREDICTION. Or do you want the "resulting" confidence bounds, which should be in your output dataset PRED?

I suppose the easiest way to the first is to use the known standard error of a probability (=sqrt(p * (1-p) /n) and the CDF function for the normal distribution to calculate the approximate confidence bounds. If it is the second, then fitting a spline through the values (using SGPLOT) should work.

Steve Denham

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

You want to use proc plm after running glimmix. Put the Xs in a data set first, run glimmix using the store= command, then run plm to do the scoring. You get predicted and confidence limits. You either print the results or plot them. If you don't use ilink option in plm, you get the logit predictions and confidence limits.

```
data scoreX;
do X = 10 to 50;
output; end;
run;
proc glimmix data=data ;
class Id;
model Success (event='1') = X / solution dist=BERNOULLI link=logit;
random int / sub=ID;
output out=pred pred=p pred(ILINK )=pp pred(NOBLUP ILINK )=marmean STDERR(NOBLUP ILINK )=se LCL(NOBLUP ILINK )=lcl UCL(NOBLUP ILINK )=Ucl;
```**store results**;
run;
proc plm **restore=results**;
score data=scoreX out=predict predicted lclm uclm / ilink;
run;
proc print data=predict;run;
proc sgplot data=predict;
series x=x y=predicted / lineattrs=(color=red thickness=2);
series x=x y=lclm / lineattrs=(color=blue);
series x=x y=uclm / lineattrs=(color=blue);
run;

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Thank you both for replying.

Steve, I will try to explain better. I ran the logistic model with only a few values of explanatory variable. In nature it is continous, but my data has only a few possible values of this variable. I wish to use the coefficients to predict the probability of success for every value between 10 to 50. This is done easily like I did in my code, but as for the CI's, SAS will only give me CI's for the predictions of my data, i.e. for the values i have in the data, and since I don't have enough of them, I can't plot them. I do get a nice sigmoid for the predictions.

If I use your formula (a CI for proportion using a normal approximation), do I need to have at least 30 subjects or at least 30 obesrvations (the reason I used GLIMMIX is because every subject had several observations, with different levels of X).

Ivm,

your code ran, I will check to see if the results makes any sense, if so, then I have learned something new, this is the first time I hear of proc plm...I still don't know how it works or what you did, but I do have 3 curves !

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Since you are coming up with "hypothetical" values, based on pre-calculated levels, the N needed is entirely up to you. Check your current work, and use the degrees of freedom for the estimates currently calculated. That should give the most relevant confidence bounds.

Steve Denham

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Unfortunate, but true. However, you could kluge together the residual variance and ID variance component into total variance and plug that in. Starting to get a bit dodgy, though, pulling the variability from the observed part, and applying it to the "assumed". You *could* add a set of overdispersion factors to the formula I gave to account for the subject to subject variability, and generate a family of confidence intervals around the curve you are generating. That sounds like the law of diminishing returns may be starting to set in, though--too much extra work for too little additional insight.

Steve Denham

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

PLM will take care of all those tedious issues for the CI.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Larry, I think we are answering different questions here. I agree that PLM is the way to go for the results of the analysis. However, I interpreted part of the question to be--"How do I plot a curve of probabilities and their confidence bounds--independent of any data?" That's what I see in the DATA step. Coming up with the probabilities is the easy part, and will be the same for all studies. Coming up with the confidence bounds is more difficult. My first reply is for the simple random situation, where the variance of a probability is a function of the probability estimate. Later replies address how to accommodate other sources of variability. I was aiming for a general solution that was independent of any estimation procedures or data. Still working on how to get it right, but I will wager I can find it in either Stroup or Zuur's books.

Steve Demja,

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. **Registration is now open through August 30th**. Visit the SAS Hackathon homepage.

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.