astudent
Calcite | Level 5

Hello,

 

I have a dataset with binary responses for the outcome variable HHFS_bi (0, 1). RespID is nested in pant_num. I want to see whether HHFS_bi changed over time (baseline to follow-up). I have run this code:

proc glimmix data=home.FS_Binary;
class RespID pant_num group_num time DemAge00;
model HHFS_bi(ref="0")= group_num time DemAge00 group_num*time / dist=binary link=logit solution ddfm=kr; 
random RespID(pant_num*group_num) time*pant_num(group_num);
lsmeans group_num*time /oddsratio ilink e;
estimate 'intervention T2-T1 - control T2-T1' group_num*time 1 -1 -1 1/ exp cl;
run;

and got a "Did not converge" note in the log, and the results did not include the odds ratios. Someone recommended adding the variable nitemhh to improve the model's performance, but when I added it the code did not run at all. Could someone help explain what may be missing or needs to be removed from the model for me to get the desired results? Below is the code I used to create the additional suggested variable.

 

/*create binary F.S categories and their cumulative sum*/
data home.FS_Binary (rename=(cum_sum=nitemhh));
set home.final_stacked_outcomes;
if HHFS_Status in ("High","Marginal") then HHFS_bi=0;
if HHFS_Status in ("Low","Very Low") then HHFS_bi=1;
retain cum_sum;/*create the cumulative sum of the binary variable*/
cum_sum+HHFS_bi;
run;
 
Below is the log message with the additional variable for this code:
proc glimmix data=home.FS_Binary;
class RespID pant_num group_num time DemAge00;
model HHFS_bi(ref="0")/nitemhh= group_num time DemAge00 group_num*time / dist=binary link=logit solution ddfm=kr; /* since we added PartID and site in the random statement, they are considered as random effect and do not need to be included in the model */
random RespID(pant_num*group_num) time*pant_num(group_num);
lsmeans group_num*time /oddsratio ilink e;/*see the position of the levels below*/
estimate 'intervention T2-T1 - control T2-T1' group_num*time 1 -1 -1 1/ exp cl;
run;
astudent_1-1700931981702.png

 

 
Below is the log message for this code:
proc glimmix data=home.FS_Binary;
class RespID pant_num group_num time DemAge00;
model HHFS_bi(ref="0")= group_num time DemAge00 group_num*time / dist=binary link=logit solution ddfm=kr; 
random RespID(pant_num*group_num) time*pant_num(group_num);
lsmeans group_num*time /oddsratio ilink e;
estimate 'intervention T2-T1 - control T2-T1' group_num*time 1 -1 -1 1/ exp cl;
run;
astudent_0-1700931652832.png

 

 

sbxkoenk
SAS Super FREQ

For the 2nd PROC GLIMMIX (the one with ERROR in red).

You probably want to change

model HHFS_bi(ref="0")/nitemhh= group_num time DemAge00 group_num*time / dist=binary link=logit solution ddfm=kr;

into

model HHFS_bi(ref="0") = nitemhh group_num time DemAge00 group_num*time / dist=binary link=logit solution ddfm=kr;

BR, Koen
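
For context, the response/variable= form in a MODEL statement is the events/trials syntax, which GLIMMIX reads as a count of events out of a number of trials (binomial), so it is not a way to add a covariate. Here is a minimal sketch of what that syntax is meant for, assuming a hypothetical aggregated dataset work.pantry_counts with one row per pantry, group, and time, and hypothetical variables n_events and n_trials (none of these appear in this thread):

```sas
/* Hypothetical aggregated data: work.pantry_counts, n_events, n_trials */
proc glimmix data=work.pantry_counts;
   class pant_num group_num time;
   /* events/trials: n_events outcomes out of n_trials households (binomial) */
   model n_events/n_trials = group_num time group_num*time
         / dist=binomial link=logit solution;
   random intercept / subject=pant_num;   /* pantry-level random intercept */
run;
```

With one row per respondent and a 0/1 response, as in home.FS_Binary, the single-response form with dist=binary is the one that applies.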

astudent
Calcite | Level 5

Thank you so much. I changed the MODEL statement and the code now runs; however, I still have the convergence issue and I am not getting the odds ratios in my results. Please see below how my results are coming out:

astudent_0-1701023715810.png

astudent_2-1701023745400.png

astudent_4-1701023808723.png

astudent_5-1701023825366.png

 

 

sbxkoenk
SAS Super FREQ

Hello,

 

You have many columns in the design matrices (especially Z) and relatively few observations.

It's a bit of a mismatch / imbalance, but OK.

 

You have a hierarchical data structure and you have a nested design.
RespID is nested within pant_num and pant_num is nested within group_num, correct?

 

I think you need to get rid of the 1st asterisk (*) in your random statement.
Change

random RespID(pant_num*group_num) time*pant_num(group_num);

into

random RespID(pant_num group_num) time*pant_num(group_num);

and submit again.

BR, Koen

sbxkoenk
SAS Super FREQ

On top of my previous reply (from 10 minutes ago) ...

 

You probably also need to use SUBJECT= as an option in the RANDOM statement.
That will make it much more numerically efficient.

Definitely worth checking out.

 

Understanding the Subject= Effect in SAS Mixed Models Software
SAS Software YouTube channel
https://www.youtube.com/watch?v=pX88W9xViJ8

 

Br, Koen
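
A minimal sketch of what a SUBJECT= formulation could look like for this design, assuming respondents are nested within pantries and that a random intercept at each level is wanted. Whether both variance components can actually be estimated depends on the data, so treat this as a starting point rather than the recommended model:

```sas
/* Sketch only: random intercepts by subject instead of large nested effects */
proc glimmix data=home.FS_Binary;
   class RespID pant_num group_num time DemAge00;
   model HHFS_bi(ref="0") = group_num time DemAge00 group_num*time
         / dist=binary link=logit solution;
   random intercept / subject=pant_num;           /* pantry-level variation */
   random intercept / subject=RespID(pant_num);   /* respondent within pantry */
run;
```

The statement random intercept / subject=RespID(pant_num) describes the same variance component as random RespID(pant_num), but lets GLIMMIX process the data subject by subject.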

astudent
Calcite | Level 5

Thank you. I changed that and I still do not have the odds ratios. Please see below the results: 

astudent_0-1701029260212.png

 

astudent_1-1701029279818.png

 

astudent_2-1701029296993.png

 

astudent_3-1701029313919.png

Here is the log message:

 

astudent_4-1701029346486.png

 

sbxkoenk
SAS Super FREQ

In the Covariance Parameter Estimates table (ODS table CovParms), I can still see the asterisk (*).

 

You probably haven't tried subject= effect yet.

 

For GLIMMIX convergence issues, see another post I have done in this topic thread.

 

Koen

sbxkoenk
SAS Super FREQ

And here are some Model Convergence tips for PROC GLIMMIX (sorry for the blue highlighting, but you can still read it after clicking the image to maximize) :
https://communities.sas.com/t5/Statistical-Procedures/Issues-with-TYPE-in-PROC-GLIMMIX/m-p/800154#M3...

 

Koen

jiltao
SAS Super FREQ

Your RANDOM statement specification is unusual. That might be causing the nonconvergence issue. Can you send in your sample data? And explain what RespID and pant_num are?

Thanks,

Jill

SteveDenham
Jade | Level 19

Looking at your iteration history, I see that it is not moving toward a solution, and your gradient is getting larger with each step. That is a strong indicator that your RANDOM statement is "over-specified" in some sense. Consider doing two things: simplify the RANDOM statement and add an NLOPTIONS maxiter=1000; statement. GLIMMIX has a default maximum of 20 iterations, and for binary responses this is almost always too few.

 

SteveDenham
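
As far as I know, two different iteration limits are in play with the default pseudo-likelihood (RSPL) method: MAXOPT= on the PROC GLIMMIX statement caps the number of outer optimizations (documented default of 20), while NLOPTIONS MAXITER= caps the iterations inside each optimization. The sketch below raises both and uses a deliberately simple RANDOM statement. The values 100 and 1000 are arbitrary choices, and the random intercept per respondent is an assumption about what a simplified model might be:

```sas
/* Sketch only: raise both iteration limits and simplify the random effects */
proc glimmix data=home.FS_Binary maxopt=100;   /* outer pseudo-likelihood updates */
   nloptions maxiter=1000;                     /* iterations within each optimization */
   class RespID pant_num group_num time DemAge00;
   model HHFS_bi(ref="0") = group_num time DemAge00 group_num*time
         / dist=binary link=logit solution ddfm=kr;
   random intercept / subject=RespID;          /* one random intercept per respondent */
run;
```

If the gradient in the iteration history keeps growing even with higher limits, that points back to the model specification rather than to the iteration caps.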

astudent
Calcite | Level 5

I appreciate all the suggestions I received so far. I tried simplifying the random statement, using the subject in the random statement and adding the NLOPTIONS maxiter=1000 statement. I still have the convergence issues and I did not get the results I am looking for. Here is the sample data:

RespID  pant_num  group_num  time  DemAge00  HHFS_bi
16      1         1          1     2         0
16      1         1          2     4         1
20      3         1          1     2         1
20      3         1          2     3         1
21      2         2          1     4         1
21      2         2          2     3         1
22      4         2          1     4         1
22      4         2          2     2         1
24      3         2          1     3         0
24      3         2          2     4         1
30      4         2          1     2         0
30      4         2          2     3         1

 

RespID is the respondent ID, pant_num is the pantry number, and group_num is the study arm (1=control group, 2=intervention group). I would like to get the odds of people in either group (control and intervention) being in the food secure category after the intervention, that is, whether HHFS_bi (0=food insecure and 1=food secure) has changed over time.
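
For anyone who wants to experiment with these rows, here is a minimal sketch that loads them into a temporary dataset. It reads each sample row as a two-digit RespID followed by five one-digit fields, which is an assumption about how the pasted table lost its column breaks; the dataset name work.fs_sample is hypothetical:

```sas
/* Assumption: each row is RespID (two digits) then five one-digit fields */
data work.fs_sample;
   input RespID pant_num group_num time DemAge00 HHFS_bi;
   datalines;
16 1 1 1 2 0
16 1 1 2 4 1
20 3 1 1 2 1
20 3 1 2 3 1
21 2 2 1 4 1
21 2 2 2 3 1
22 4 2 1 4 1
22 4 2 2 2 1
24 3 2 1 3 0
24 3 2 2 4 1
30 4 2 1 2 0
30 4 2 2 3 1
;
run;

/* Cross-tabulate the outcome by group and time to see how sparse the cells are */
proc freq data=work.fs_sample;
   tables group_num*time*HHFS_bi / nocol nopercent;
run;
```

Twelve rows are far too few to fit the mixed model, but the cross-tabulation is a quick way to spot empty or near-empty group-by-time cells in the full data, which can also contribute to convergence trouble.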

 

This is my code currently:
proc glimmix data=home.FS_Binary;
NLOPTIONS maxiter=1000;
class RespID pant_num group_num time DemAge00;
model HHFS_bi(ref="0")=group_num time DemAge00 group_num*time / dist=binary link=logit solution ddfm=kr oddsratio; /* since we added PartID and site in the random statement, they are considered as random effect and do not need to be included in the model */
random RespID(pant_num);
lsmeans group_num*time /oddsratio ilink e;/*see the position of the levels below*/
estimate 'intervention T2-T1 - control T2-T1' group_num*time 1 -1 -1 1/ exp cl;
run;

and this is the log message: 


378 proc glimmix data=home.FS_Binary; NLOPTIONS maxiter=1000;
379 class RespID pant_num group_num time DemAge00;
380 model HHFS_bi(ref="0")=group_num time DemAge00 group_num*time / dist=binary link=logit
380! solution ddfm=kr oddsratio; /* since we added PartID and site in the random statement,
380! they are considered as random effect and do not need to be included in the model */
381 random RespID(pant_num);
382 lsmeans group_num*time /oddsratio ilink e;/*see the position of the levels below*/
383 estimate 'intervention T2-T1 - control T2-T1' group_num*time 1 -1 -1 1/ exp cl;
384 run;

 

NOTE: Some observations are not used in the analysis because of: missing response values
(n=60), missing fixed effects (n=2), missing random effects (n=2).
NOTE: The GLIMMIX procedure is modeling the probability that HHFS_bi='1'.
NOTE: Did not converge.
NOTE: PROCEDURE GLIMMIX used (Total process time):
real time 2.38 seconds
cpu time 2.35 seconds

 

SteveDenham
Jade | Level 19

You are running with the default RSPL method, which often has trouble with binary data. Here is my recommendation of how to change your code:

 

proc glimmix data=home.FS_Binary method=laplace;
nloptions maxiter=1000;
class RespID pant_num group_num time DemAge00;
/* ddfm=kr removed: it is not available with method=laplace */
model HHFS_bi(ref="0") = group_num time DemAge00 group_num*time / dist=binary link=logit solution oddsratio;
random pant_num / subject=RespID;
lsmeans group_num*time / oddsratio ilink e; /* the E option prints the coefficients, which shows the order of the group_num*time levels */
estimate 'intervention T2-T1 - control T2-T1' group_num*time 1 -1 -1 1 / exp cl;
run;

This shifts to a maximum likelihood method based on Laplace integration, which doesn't allow Kenward-Roger degrees of freedom. I also suggest rewording the RANDOM statement to get processing by subject. However, based on the sample data, it appears that each pant_num has only a single subject. If that is the case for the full dataset, you'll need to replace pant_num with intercept.

 

SteveDenham
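
A sketch of the variant described in the last sentence above, with pant_num replaced by intercept in the RANDOM statement. This assumes a single random intercept per respondent is adequate for the full dataset, which only the data can confirm:

```sas
/* Sketch only: Laplace estimation with a random intercept per respondent */
proc glimmix data=home.FS_Binary method=laplace;
   nloptions maxiter=1000;
   class RespID pant_num group_num time DemAge00;
   model HHFS_bi(ref="0") = group_num time DemAge00 group_num*time
         / dist=binary link=logit solution oddsratio;
   random intercept / subject=RespID;
   lsmeans group_num*time / ilink;   /* predicted probabilities per cell */
   estimate 'intervention T2-T1 - control T2-T1'
            group_num*time 1 -1 -1 1 / exp cl;
run;
```

With the interaction cells ordered as (group 1, time 1), (group 1, time 2), (group 2, time 1), (group 2, time 2), the coefficients 1 -1 -1 1 give the time 2 minus time 1 change in the intervention group minus the same change in the control group, on the logit scale. The EXP option then reports that difference as a ratio of odds ratios, with confidence limits from CL.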

 

astudent
Calcite | Level 5

Thank you very very much. The code worked and I got all the results I was looking for. I appreciate your help a lot.
