Solved: Re: proc glimmix, convergence and odds ratio

astudent · Posted 11-25-2023 12:37 PM

Hello,

I have a dataset with a binary outcome variable, HHFS_bi (0, 1). RespID is nested within pant_num. I want to see whether HHFS_bi changed over time (baseline to follow-up). I have run this code:

proc glimmix data=home.FS_Binary;
class RespID pant_num group_num time DemAge00;
model HHFS_bi(ref="0")= group_num time DemAge00 group_num*time / dist=binary link=logit solution ddfm=kr;
random RespID(pant_num*group_num) time*pant_num(group_num);
lsmeans group_num*time /oddsratio ilink e;
estimate 'intervention T2-T1 - control T2-T1' group_num*time 1 -1 -1 1/ exp cl;
run;

and got a "Did not converge" note in the log, and the results did not include the odds ratios. Someone recommended adding the variable nitemhh to improve the model's performance, but when I added it the code did not run at all. Could someone help explain what is missing or needs to be removed from the model for me to get the desired results? Below is the code I used to create the suggested additional variable.

/*create binary F.S categories and their cumulative sum*/
data home.FS_Binary (rename=(cum_sum=nitemhh));
set home.final_stacked_outcomes;
if HHFS_Status in ("High","Marginal") then HHFS_bi=0;
if HHFS_Status in ("Low","Very Low") then HHFS_bi=1;
retain cum_sum; /*create the cumulative sum of the binary variable*/
cum_sum+HHFS_bi;
run;

Below is the log message with the additional variable for this code:

proc glimmix data=home.FS_Binary;
class RespID pant_num group_num time DemAge00;
model HHFS_bi(ref="0")/nitemhh= group_num time DemAge00 group_num*time / dist=binary link=logit solution ddfm=kr; /* since we added PartID and site in the random statement, they are considered as random effect and do not need to be included in the model */
random RespID(pant_num*group_num) time*pant_num(group_num);
lsmeans group_num*time /oddsratio ilink e;/*see the position of the levels below*/
estimate 'intervention T2-T1 - control T2-T1' group_num*time 1 -1 -1 1/ exp cl;
run;

Below is the log message for this code:

proc glimmix data=home.FS_Binary;
class RespID pant_num group_num time DemAge00;
model HHFS_bi(ref="0")= group_num time DemAge00 group_num*time / dist=binary link=logit solution ddfm=kr;
random RespID(pant_num*group_num) time*pant_num(group_num);
lsmeans group_num*time /oddsratio ilink e;
estimate 'intervention T2-T1 - control T2-T1' group_num*time 1 -1 -1 1/ exp cl;
run;

sbxkoenk · Posted 11-26-2023 12:53 PM

For the 2nd PROC GLIMMIX (the one with ERROR in red).

You probably want to change

model HHFS_bi(ref="0")/nitemhh= group_num time DemAge00 group_num*time / dist=binary link=logit solution ddfm=kr;

into

model HHFS_bi(ref="0") = nitemhh group_num time DemAge00 group_num*time / dist=binary link=logit solution ddfm=kr;

BR, Koen

astudent · Posted 11-26-2023 01:37 PM

Thank you so much. I changed the model statement and the code now runs OK; however, I still have the convergence issue and I am not getting the odds ratios in my results. Please see below how my results are coming out:

sbxkoenk · Posted 11-26-2023 02:52 PM

Hello,

You have many columns in the design matrices (especially Z) and not so many observations.

It's a bit of a mismatch / imbalance, but OK.

You have a hierarchical data structure and you have a nested design.
RespID is nested within pant_num and pant_num is nested within group_num, correct?

I think you need to get rid of the 1st asterisk (*) in your random statement.
Change

random RespID(pant_num*group_num) time*pant_num(group_num);

into

random RespID(pant_num group_num) time*pant_num(group_num);

and submit again.

BR, Koen

sbxkoenk · Posted 11-26-2023 02:57 PM

On top of my previous reply (from 10 minutes ago) ...

Probably you also need to use SUBJECT= as an option in RANDOM statement.
That will make it much more numerically efficient.

Definitely worth checking out.
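
As a minimal sketch of the SUBJECT= form, assuming respondents are nested within pantries as described in the original post: a nested effect in the RANDOM effect list can be rewritten as a random intercept with the nesting moved into SUBJECT=. The two RANDOM statements shown describe the same variance component, but only the SUBJECT= form lets GLIMMIX process the data subject by subject. The fixed effects are copied from the posted model; using a single random effect here is an illustration, not a recommendation for this dataset.

```sas
proc glimmix data=home.FS_Binary;
class RespID pant_num group_num time DemAge00;
model HHFS_bi(ref="0") = group_num time DemAge00 group_num*time / dist=binary link=logit solution;
/* Effect-list form (commented out): random RespID(pant_num); */
/* Equivalent SUBJECT= form: a random intercept for each RespID within pant_num */
random intercept / subject=RespID(pant_num);
run;
```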

Understanding the Subject= Effect in SAS Mixed Models Software
SAS Software YouTube channel
https://www.youtube.com/watch?v=pX88W9xViJ8

Br, Koen

astudent · Posted 11-26-2023 03:09 PM

Thank you. I changed that and I still do not have the odds ratios. Please see below the results:

Here is the log message:

sbxkoenk · Posted 11-26-2023 03:40 PM

In the Covariance Parameter (CovParm) Estimates output object (ODS table), I can still see the asterisk (*).

You probably haven't tried subject= effect yet.

For GLIMMIX convergence issues, see another post I have done in this topic thread.

Koen

sbxkoenk · Posted 11-26-2023 12:59 PM

And here are some Model Convergence tips for PROC GLIMMIX (sorry for the blue highlighting, but you can still read it after clicking the image to maximize) :
https://communities.sas.com/t5/Statistical-Procedures/Issues-with-TYPE-in-PROC-GLIMMIX/m-p/800154#M3...

Koen

jiltao · Posted 11-27-2023 09:41 AM

Your RANDOM statement specification is unusual. That might be causing the nonconvergence issue. Can you send in your sample data? And explain what RESP_ID and PANT_ID are?

Thanks,

Jill

SteveDenham · Posted 11-29-2023 08:53 AM

Looking at your iteration history, I see that it is not moving toward a solution, and your gradient is getting larger with each step. That is a strong indicator that your RANDOM statement is "over-specified" in some sense. Consider doing two things: simplify the RANDOM statement and add an NLOPTIONS maxiter=1000; statement. GLIMMIX has a default maximum of 20 iterations, and for binary responses this is almost always too few.
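
A minimal sketch of how the iteration history could be captured for a closer look, assuming the ITDETAILS option and the IterHistory and ConvergenceStatus ODS tables of PROC GLIMMIX. The dataset names work.iter and work.conv and the simplified RANDOM statement are illustrative choices, not part of the posted code.

```sas
/* Save the iteration history and convergence status as datasets */
ods output IterHistory=work.iter ConvergenceStatus=work.conv;

/* itdetails adds parameter estimates and gradients to the iteration history */
proc glimmix data=home.FS_Binary itdetails;
nloptions maxiter=1000;
class RespID pant_num group_num time DemAge00;
model HHFS_bi(ref="0") = group_num time DemAge00 group_num*time / dist=binary link=logit solution;
/* One possible simplification of the RANDOM statement (an assumption) */
random intercept / subject=RespID(pant_num);
run;

/* Inspect whether the gradient shrinks or grows from step to step */
proc print data=work.conv; run;
proc print data=work.iter; run;
```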

SteveDenham

astudent · Posted 12-04-2023 03:21 PM

I appreciate all the suggestions I received so far. I tried simplifying the random statement, using the subject in the random statement and adding the NLOPTIONS maxiter=1000 statement. I still have the convergence issues and I did not get the results I am looking for. Here is the sample data:

RespID	pant_num	group_num	time	DemAge00	HHFS_bi
16	1	1	1	2	0
16	1	1	2	4	1
20	3	1	1	2	1
20	3	1	2	3	1
21	2	2	1	4	1
21	2	2	2	3	1
22	4	2	1	4	1
22	4	2	2	2	1
24	3	2	1	3	0
24	3	2	2	4	1
30	4	2	1	2	0
30	4	2	2	3	1

RespID is the respondent ID, pant_num is the pantry number, and group_num is the study group (1=control group, 2=intervention group). I would like to get the odds of people in each group (control and intervention) being in the food secure category after the intervention, that is, whether HHFS_bi (0=food insecure and 1=food secure) has changed over time.
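
For reference, here is what the ESTIMATE coefficients 1 -1 -1 1 on group_num*time work out to. This is a sketch that assumes the CLASS levels sort as group_num = 1, 2 and time = 1, 2 with time varying fastest, so the four cells are ordered (1,1), (1,2), (2,1), (2,2). The symbol \eta_{gt} is notation introduced here for the log odds of HHFS_bi = 1 in group g at time t, with the covariate and random effects held fixed; the main effects cancel in the double difference, so only the interaction terms contribute.

```latex
\begin{aligned}
L &= \eta_{11} - \eta_{12} - \eta_{21} + \eta_{22}
   = (\eta_{22} - \eta_{21}) - (\eta_{12} - \eta_{11}), \\
\exp(L) &= \frac{\mathrm{odds}_{22} / \mathrm{odds}_{21}}
                {\mathrm{odds}_{12} / \mathrm{odds}_{11}}.
\end{aligned}
```

With group 2 as the intervention group, exp(L) is a ratio of two odds ratios: the follow-up versus baseline odds ratio in the intervention group divided by the same odds ratio in the control group. A value of 1 would mean both groups changed by the same amount on the odds scale.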

This is my code currently:

proc glimmix data=home.FS_Binary; NLOPTIONS maxiter=1000;
class RespID pant_num group_num time DemAge00;
model HHFS_bi(ref="0")=group_num time DemAge00 group_num*time / dist=binary link=logit solution ddfm=kr oddsratio; /* since we added PartID and site in the random statement, they are considered as random effect and do not need to be included in the model */
random RespID(pant_num);
lsmeans group_num*time /oddsratio ilink e;/*see the position of the levels below*/
estimate 'intervention T2-T1 - control T2-T1' group_num*time 1 -1 -1 1/ exp cl;
run;

and this is the log message:

378 proc glimmix data=home.FS_Binary; NLOPTIONS maxiter=1000;
379 class RespID pant_num group_num time DemAge00;
380 model HHFS_bi(ref="0")=group_num time DemAge00 group_num*time / dist=binary link=logit
380! solution ddfm=kr oddsratio; /* since we added PartID and site in the random statement,
380! they are considered as random effect and do not need to be included in the model */
381 random RespID(pant_num);
382 lsmeans group_num*time /oddsratio ilink e;/*see the position of the levels below*/
383 estimate 'intervention T2-T1 - control T2-T1' group_num*time 1 -1 -1 1/ exp cl;
384 run;

NOTE: Some observations are not used in the analysis because of: missing response values
(n=60), missing fixed effects (n=2), missing random effects (n=2).
NOTE: The GLIMMIX procedure is modeling the probability that HHFS_bi='1'.
NOTE: Did not converge.
NOTE: PROCEDURE GLIMMIX used (Total process time):
real time 2.38 seconds
cpu time 2.35 seconds

SteveDenham · Posted 12-06-2023 08:28 AM

You are running with the default RSPL method, which often has trouble with binary data. Here is my recommendation of how to change your code:

proc glimmix data=home.FS_Binary method=laplace;
NLOPTIONS maxiter=1000;
class RespID pant_num group_num time DemAge00;
model HHFS_bi(ref="0")=group_num time DemAge00 group_num*time / dist=binary link=logit solution /* ddfm=kr*/ oddsratio; /* since we added PartID and site in the random statement, they are considered as random effect and do not need to be included in the model */
random pant_num/subject=respID;
lsmeans group_num*time /oddsratio ilink e;/*see the position of the levels below*/
estimate 'intervention T2-T1 - control T2-T1' group_num*time 1 -1 -1 1/ exp cl;
run;

This shifts to a maximum likelihood method based on integral approximation (Laplace), which doesn't allow Kenward-Roger degrees of freedom. I also suggest rewording the RANDOM statement to get processing by subject. However, based on the sample data, it appears that each subject (RespID) has only a single pant_num. If that is the case for the full dataset, you'll need to replace pant_num with intercept.
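
If the full dataset does have that structure, the replacement might look like the following sketch. It is the recommended code from this post with only the RANDOM statement changed; whether a respondent-level intercept alone is adequate for the full dataset is an assumption that should be checked.

```sas
proc glimmix data=home.FS_Binary method=laplace;
nloptions maxiter=1000;
class RespID pant_num group_num time DemAge00;
model HHFS_bi(ref="0") = group_num time DemAge00 group_num*time / dist=binary link=logit solution oddsratio;
/* Assumption: each RespID belongs to exactly one pant_num, so a random */
/* intercept per respondent takes the place of "random pant_num"        */
random intercept / subject=RespID;
lsmeans group_num*time / oddsratio ilink e;
estimate 'intervention T2-T1 - control T2-T1' group_num*time 1 -1 -1 1 / exp cl;
run;
```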

SteveDenham

astudent · Posted 12-06-2023 03:52 PM

Thank you very very much. The code worked and I got all the results I was looking for. I appreciate your help a lot.
