Lao_feng
Obsidian | Level 7
 

I am fitting a logistic regression with random intercepts to account for within-cluster correlation. There are 3 levels in my data:

Level 1: individual subject

Level 2: Family (some subjects come from the same family); the variable is fam_num

Level 3: Village; the variable is clu_num

 

The model includes country, age, gender, education and marital status.

 

My code is:

 

proc glimmix data=work0 NOCLPRINT;
class country(ref='Pakistan') clu_num fam_num age4g(ref='40~49') gender edu2g mar2g;
model comb2g(ref='0') = country age4g gender edu2g mar2g / solution link=logit dist=binary;
random int / sub=clu_num;           /* village-level random intercept */
random int / sub=fam_num(clu_num);  /* family-within-village random intercept */
COVTEST GLM;                        /* test against the model with no random effects */
run;

 

I used 'COVTEST GLM' to test whether the outcomes are independent within clusters. The results are shown below.

The p-value is 0.4115, suggesting that the clustering effects are not statistically significant. In that case, can I remove the random effects from the model and use standard logistic regression? Thanks.

 

Covariance Parameter Estimates

Cov Parm     Subject          Estimate    Standard Error
Intercept    clu_num          0.01724     0.02710
Intercept    fam_n(clu_nu)    0.01920     0.1512

 

Tests of Covariance Parameters
Based on the Residual Pseudo-Likelihood

Label           DF    -2 Res Log P-Like    ChiSq    Pr > ChiSq    Note
Independence    2     10507                0.58     0.4115        MI
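
A note on reading the table above: the "MI" flag in the Note column indicates that the p-value is based on a mixture of chi-squares, not on a single chi-square with 2 DF, because both variances are tested at the boundary of the parameter space (zero). As a rough check, assume mixture weights of 1/4, 1/2 and 1/4 on 0, 1 and 2 DF. These weights are an assumption for illustration; the weights the procedure actually used are not shown in the output.

```latex
\begin{align*}
p &\approx \tfrac{1}{4}\,P(\chi^2_0 > 0.58)
         + \tfrac{1}{2}\,P(\chi^2_1 > 0.58)
         + \tfrac{1}{4}\,P(\chi^2_2 > 0.58) \\
  &\approx \tfrac{1}{4}(0) + \tfrac{1}{2}(0.446) + \tfrac{1}{4}(0.748) \\
  &\approx 0.41
\end{align*}
```

This is in line with the reported 0.4115, allowing for ChiSq being rounded to 0.58. A plain chi-square reference with 2 DF would give roughly 0.75 instead, so the mixture makes the test less conservative, but the conclusion here is the same either way.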

Accepted Solution
BISTGP
Fluorite | Level 6
I tend to agree with sld. Your model should reflect the data generating process as much as possible. That the random effects are not statistically significant at p < 0.05 is also a sample size issue. Also, and this is more tactics than science, but if you are submitting for publication it will be low-hanging fruit for a reviewer to ask about clustering, so now you have included it.
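
One way to see how much the clustering matters in practice is to fit the ordinary logistic model as a sensitivity check and compare its fixed-effect estimates and standard errors with the GLIMMIX fit. The following is a minimal sketch that reuses the data set and variable names from the question. The param=ref option is an addition not in the original post; it requests reference-cell coding so the estimates are on a scale comparable to the GLIMMIX solution.

```sas
/* Sensitivity check: ordinary logistic regression, ignoring clustering. */
/* Data set and variable names are taken from the question.              */
proc logistic data=work0;
  class country(ref='Pakistan') age4g(ref='40~49') gender edu2g mar2g / param=ref;
  model comb2g(ref='0') = country age4g gender edu2g mar2g;
run;
```

If the two sets of estimates and standard errors are close, that supports reporting the mixed model without the choice changing any conclusions.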

sld
Rhodochrosite | Level 12

Belatedly, my personal opinion is that the statistical model should mimic the experimental design. So if you have clusters in the experimental design, then you have variance components associated with those clusters, and so you should keep those variance components regardless of whether the estimates are statistically "significant". 
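
If the variance components are kept, one option is to refit the same model with METHOD=LAPLACE, so that COVTEST is based on an approximation to the true likelihood instead of the residual pseudo-likelihood shown in the question. This is only a sketch under that assumption: the thread does not establish whether Laplace estimation behaves well on this data, and the label string is illustrative.

```sas
/* Same model as in the question, estimated by Laplace approximation. */
proc glimmix data=work0 noclprint method=laplace;
  class country(ref='Pakistan') clu_num fam_num age4g(ref='40~49') gender edu2g mar2g;
  model comb2g(ref='0') = country age4g gender edu2g mar2g / solution link=logit dist=binary;
  random int / sub=clu_num;           /* village-level random intercept */
  random int / sub=fam_num(clu_num);  /* family-within-village random intercept */
  covtest 'No clustering at either level' glm;  /* likelihood ratio test vs. no random effects */
run;
```

The random effects stay in the model regardless of the test result; the test is reported as a description of how much clustering the data show, not as a model selection step.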

 

Lao_feng
Obsidian | Level 7

Thank you all for your kind help!
