Hello,
I have a dataset with a binary outcome variable, HHFS_bi (0, 1). RespID is nested within pant_num. I want to see whether HHFS_bi changed over time (baseline to follow-up). I have run this code:
and got a "Did not converge" note in the log, and the results did not include the odds ratios. Someone recommended adding the variable nitemhh to improve the model's performance, but when I added it the code did not run at all. Could someone help explain what may be missing from the model, or what needs to be removed, for me to get the desired results? Below is the code I used to create the additional suggested variable.
For the 2nd PROC GLIMMIX (the one with ERROR in red).
You probably want to change
model HHFS_bi(ref="0")/nitemhh= group_num time DemAge00 group_num*time / dist=binary link=logit solution ddfm=kr;
into
model HHFS_bi(ref="0") = nitemhh group_num time DemAge00 group_num*time / dist=binary link=logit solution ddfm=kr;
BR, Koen
Thank you so much. I changed the model statement and the code now runs; however, I still have the convergence issue and I am not getting the odds ratios in my results. Please see below how my results are coming out:
Hello,
You have many columns in the design matrices (especially Z) and not so many observations.
It's a bit of a mismatch / imbalance , but OK.
You have a hierarchical data structure and you have a nested design.
RespID is nested within pant_num and pant_num is nested within group_num, correct?
I think you need to get rid of the 1st asterisk (*) in your random statement.
Change
random RespID(pant_num*group_num) time*pant_num(group_num);
into
random RespID(pant_num group_num) time*pant_num(group_num);
and submit again.
BR, Koen
On top of my previous reply (from 10 minutes ago) ...
Probably you also need to use SUBJECT= as an option in RANDOM statement.
That will make it much more numerically efficient.
Definitely worth checking out.
Understanding the Subject= Effect in SAS Mixed Models Software
SAS Software YouTube channel
https://www.youtube.com/watch?v=pX88W9xViJ8
Br, Koen
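A minimal sketch of what the SUBJECT= suggestion could look like for the RANDOM statement discussed above. It assumes the nesting described in the earlier reply (RespID within pant_num within group_num) and reuses the MODEL statement from the thread; it is an illustration, not a tested fix for this dataset. In GLIMMIX, `random A / subject=B;` specifies the same G-side effect as `random A*B;`, so the two statements below correspond to `RespID(pant_num group_num)` and `time*pant_num(group_num)`.

```sas
proc glimmix data=home.FS_Binary;
   class RespID pant_num group_num time DemAge00;
   model HHFS_bi(ref="0") = group_num time DemAge00 group_num*time
         / dist=binary link=logit solution oddsratio;
   /* random intercept for each respondent (assumed nested in pantry and group) */
   random intercept / subject=RespID(pant_num group_num);
   /* random time effect within each pantry (assumed nested in group) */
   random time / subject=pant_num(group_num);
run;
```

Writing the effects this way lets the procedure process the data subject by subject, which is where the numerical efficiency mentioned above comes from.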
Thank you. I changed that and I still do not have the odds ratios. Please see below the results:
Here is the log message:
In the Covariances Parameter (CovParm) Estimates output object (ODS table)
I can still see the asterisk (*).
You probably haven't tried subject= effect yet.
For GLIMMIX convergence issues, see another post I have done in this topic thread.
Koen
And here are some Model Convergence tips for PROC GLIMMIX (sorry for the blue highlighting, but you can still read it after clicking the image to maximize) :
https://communities.sas.com/t5/Statistical-Procedures/Issues-with-TYPE-in-PROC-GLIMMIX/m-p/800154#M3...
Koen
Your RANDOM statement specification is unusual. That might be causing the nonconvergence issue. Can you send in your sample data? And explain what RespID and pant_num are?
Thanks,
Jill
Looking at your iteration history, I see that it is not moving toward a solution, and your gradient is getting larger with each step. That is a strong indicator that your RANDOM statement is "over-specified" in some sense. Consider doing two things: simplify the RANDOM statement and add an NLOPTIONS maxiter=1000; statement. GLIMMIX has a default maximum of 20 iterations, and for binary responses this is almost always too few.
SteveDenham
I appreciate all the suggestions I received so far. I tried simplifying the random statement, using the subject in the random statement and adding the NLOPTIONS maxiter=1000 statement. I still have the convergence issues and I did not get the results I am looking for. Here is the sample data:
RespID | pant_num | group_num | time | DemAge00 | HHFS_bi |
16 | 1 | 1 | 1 | 2 | 0 |
16 | 1 | 1 | 2 | 4 | 1 |
20 | 3 | 1 | 1 | 2 | 1 |
20 | 3 | 1 | 2 | 3 | 1 |
21 | 2 | 2 | 1 | 4 | 1 |
21 | 2 | 2 | 2 | 3 | 1 |
22 | 4 | 2 | 1 | 4 | 1 |
22 | 4 | 2 | 2 | 2 | 1 |
24 | 3 | 2 | 1 | 3 | 0 |
24 | 3 | 2 | 2 | 4 | 1 |
30 | 4 | 2 | 1 | 2 | 0 |
30 | 4 | 2 | 2 | 3 | 1 |
RespID is the respondent ID and pant_num is the pantry number. For group_num, 1 = control group and 2 = intervention group. For HHFS_bi, 0 = food insecure and 1 = food secure. I would like to get the odds of people in each group (control and intervention) being in the food secure category after the intervention, that is, whether HHFS_bi has changed over time.
This is my code currently:
proc glimmix data=home.FS_Binary;
NLOPTIONS maxiter=1000;
class RespID pant_num group_num time DemAge00;
model HHFS_bi(ref="0")=group_num time DemAge00 group_num*time / dist=binary link=logit solution ddfm=kr oddsratio; /* since we added PartID and site in the random statement, they are considered as random effect and do not need to be included in the model */
random RespID(pant_num);
lsmeans group_num*time /oddsratio ilink e;/*see the position of the levels below*/
estimate 'intervention T2-T1 - control T2-T1' group_num*time 1 -1 -1 1/ exp cl;
run;
and this is the log message:
378 proc glimmix data=home.FS_Binary; NLOPTIONS maxiter=1000;
379 class RespID pant_num group_num time DemAge00;
380 model HHFS_bi(ref="0")=group_num time DemAge00 group_num*time / dist=binary link=logit
380! solution ddfm=kr oddsratio; /* since we added PartID and site in the random statement,
380! they are considered as random effect and do not need to be included in the model */
381 random RespID(pant_num);
382 lsmeans group_num*time /oddsratio ilink e;/*see the position of the levels below*/
383 estimate 'intervention T2-T1 - control T2-T1' group_num*time 1 -1 -1 1/ exp cl;
384 run;
NOTE: Some observations are not used in the analysis because of: missing response values
(n=60), missing fixed effects (n=2), missing random effects (n=2).
NOTE: The GLIMMIX procedure is modeling the probability that HHFS_bi='1'.
NOTE: Did not converge.
NOTE: PROCEDURE GLIMMIX used (Total process time):
real time 2.38 seconds
cpu time 2.35 seconds
You are running with the default RSPL method, which often has trouble with binary data. Here is my recommendation of how to change your code:
proc glimmix data=home.FS_Binary method=laplace;
NLOPTIONS maxiter=1000;
class RespID pant_num group_num time DemAge00;
model HHFS_bi(ref="0")=group_num time DemAge00 group_num*time / dist=binary link=logit solution /* ddfm=kr*/ oddsratio; /* since we added PartID and site in the random statement, they are considered as random effect and do not need to be included in the model */
random pant_num/subject=respID;
lsmeans group_num*time /oddsratio ilink e;/*see the position of the levels below*/
estimate 'intervention T2-T1 - control T2-T1' group_num*time 1 -1 -1 1/ exp cl;
run;
This shifts to a maximum likelihood integration method (Laplace), which doesn't allow Kenward-Roger degrees of freedom. I also suggest rewording the RANDOM statement to get processing by subject. However, based on the sample data, it appears that each RespID occurs with only a single pant_num. If that is the case for the full dataset, you'll need to replace pant_num with intercept.
SteveDenham
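For readers following along, the final point in the reply above (replacing pant_num with intercept) might look like the sketch below. It assumes every RespID belongs to exactly one pantry, so pant_num is dropped from the CLASS statement because it no longer appears in the model. Everything else is copied from the recommended code; this is an illustration and was not run against the original data.

```sas
proc glimmix data=home.FS_Binary method=laplace;
   nloptions maxiter=1000;
   class RespID group_num time DemAge00;
   model HHFS_bi(ref="0") = group_num time DemAge00 group_num*time
         / dist=binary link=logit solution oddsratio;
   /* one random intercept per respondent; assumes RespID is unique across pantries */
   random intercept / subject=RespID;
   lsmeans group_num*time / oddsratio ilink e;
   estimate 'intervention T2-T1 - control T2-T1' group_num*time 1 -1 -1 1 / exp cl;
run;
```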
Thank you very very much. The code worked and I got all the results I was looking for. I appreciate your help a lot.
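As a note on reading the output: the ESTIMATE statement in the code above uses the coefficients 1 -1 -1 1 on group_num*time. Assuming the levels are ordered (group 1, time 1), (group 1, time 2), (group 2, time 1), (group 2, time 2), which the E option on LSMEANS should let you confirm, and writing \eta_{gt} (notation introduced here, not from the thread) for the logit of P(HHFS_bi = 1) in group g at time t, the contrast works out as:

```latex
\begin{aligned}
L &= \eta_{11} - \eta_{12} - \eta_{21} + \eta_{22} \\
  &= (\eta_{22} - \eta_{21}) - (\eta_{12} - \eta_{11}) \\[4pt]
\exp(L) &= \frac{\mathrm{odds}_{22} / \mathrm{odds}_{21}}
                {\mathrm{odds}_{12} / \mathrm{odds}_{11}}
\end{aligned}
```

The main effects of group_num and time cancel in this combination, which is why coefficients are needed only on the interaction. With the EXP option, the reported value is therefore a ratio of odds ratios: the follow-up versus baseline odds ratio in the intervention group (2) divided by the same odds ratio in the control group (1).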