Lao_feng
Obsidian | Level 7
 

I am fitting a logistic regression with random intercepts in the model to account for within-cluster correlation. There are 3 levels in my data:

Level 1: individual subject

Level 2: Family (some subjects come from the same family); the variable is fam_num

Level 3: Village; the variable is clu_num

 

The model includes country, age, gender, education and marital status.

 

My code is:

 

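proc glimmix data=work0 noclprint;
  /* Cluster identifiers and categorical covariates, with reference levels */
  class country(ref='Pakistan') clu_num fam_num age4g(ref='40~49') gender edu2g mar2g;
  /* Binary outcome with logit link; '0' is the reference response level */
  model comb2g(ref='0') = country age4g gender edu2g mar2g / solution link=logit dist=binary;
  /* Random intercept for each village */
  random int / subject=clu_num;
  /* Random intercept for each family, nested within village */
  random int / subject=fam_num(clu_num);
  /* Test the fitted model against a model with no random effects */
  covtest glm;
run;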

 

I used 'COVTEST GLM' to test whether the outcomes are independent within clusters. The results I got are as follows:

The p-value is 0.4115, suggesting that the clustering effects are not statistically significant. In that case, can I remove the random effects from the model and use standard logistic regression? Thanks.

 

Covariance Parameter Estimates

Cov Parm    Subject          Estimate   Standard Error
Intercept   clu_num          0.01724    0.02710
Intercept   fam_n(clu_nu)    0.01920    0.1512

 

Tests of Covariance Parameters
Based on the Residual Pseudo-Likelihood

Label          DF   -2 Res Log P-Like   ChiSq   Pr > ChiSq   Note
Independence   2    10507               0.58    0.4115       MI
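
For a sense of scale, the two variance estimates above can be converted to intraclass correlations on the latent scale. This is only a rough sketch: it assumes the usual latent-variable formulation of the logit model, in which the level-1 residual variance is fixed at π²/3 ≈ 3.29, and the symbols σ²_v (village variance) and σ²_f (family-within-village variance) are labels introduced here for illustration, not part of the SAS output.

```latex
\begin{aligned}
\mathrm{ICC}_{\text{village}}
  &= \frac{\sigma^2_v}{\sigma^2_v + \sigma^2_f + \pi^2/3}
   = \frac{0.01724}{0.01724 + 0.01920 + 3.28987}
   \approx 0.005 \\[4pt]
\mathrm{ICC}_{\text{family}}
  &= \frac{\sigma^2_v + \sigma^2_f}{\sigma^2_v + \sigma^2_f + \pi^2/3}
   = \frac{0.03644}{3.32631}
   \approx 0.011
\end{aligned}
```

Under these assumptions both correlations are small, which is in line with the non-significant test. The standard errors are large relative to the estimates, however, so the values are imprecise.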

1 ACCEPTED SOLUTION

BISTGP
Fluorite | Level 6
I tend to agree with sld. Your model should reflect the data-generating process as much as possible. Whether the random effects reach statistical significance at p < 0.05 also depends on sample size. Also, and this is more tactics than science, if you are submitting for publication, clustering is low-hanging fruit for a reviewer to ask about, and this way you will already have accounted for it.


3 REPLIES
sld
Rhodochrosite | Level 12

Belatedly, my personal opinion is that the statistical model should mimic the experimental design. So if you have clusters in the experimental design, then you have variance components associated with those clusters, and so you should keep those variance components regardless of whether the estimates are statistically "significant". 

 

Lao_feng
Obsidian | Level 7

Thank you all for your kind help!
