bhr-q
Pyrite | Level 9

Hello All,

I ran a linear mixed model (LMM) with country (55 countries) as a random intercept. The random intercept for country was statistically significant, and model fit improved significantly compared with the model without the country random effect, as shown by a lower -2 Log Likelihood.

proc mixed data=tmp method=ML covtest;
  class country;
  model dependent_var = var1 var2 .... / s ddfm=kr;
  random intercept / subject=country;
run;
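
The likelihood comparison described above could be reproduced along these lines. This is only a sketch, not the code that was actually run: it assumes both models are fit with METHOD=ML on the same observations, it lists only var1 and var2 because the remaining predictors are elided in the post, and the data set names fit_ri, fit_ols and lrt are made up for illustration. Because a variance of zero lies on the boundary of the parameter space, the usual reference distribution for this test is a 50:50 mixture of chi-square(0) and chi-square(1), which halves the naive 1-df p-value.

```sas
/* Model with the country random intercept (ML, so -2LL is comparable) */
ods output FitStatistics=fit_ri;      /* fit_ri is a hypothetical name */
proc mixed data=tmp method=ML;
  class country;
  model dependent_var = var1 var2 / s;   /* add the remaining predictors here */
  random intercept / subject=country;
run;

/* Same fixed effects, no random effect (ordinary linear regression fit by ML) */
ods output FitStatistics=fit_ols;     /* fit_ols is a hypothetical name */
proc mixed data=tmp method=ML;
  model dependent_var = var1 var2 / s;   /* add the remaining predictors here */
run;

/* Likelihood ratio test for the single variance component */
data lrt;
  merge fit_ols(rename=(Value=m2ll_ols) where=(Descr='-2 Log Likelihood'))
        fit_ri (rename=(Value=m2ll_ri)  where=(Descr='-2 Log Likelihood'));
  chisq   = m2ll_ols - m2ll_ri;          /* drop in -2 log likelihood */
  p_naive = 1 - probchi(chisq, 1);       /* plain chi-square(1) p-value */
  p_mix   = 0.5 * p_naive;               /* 50:50 mixture p-value */
run;

proc print data=lrt;
  var m2ll_ols m2ll_ri chisq p_naive p_mix;
run;
```

The COVTEST option in the original call reports a Wald test of the variance component instead; the likelihood ratio version is generally the more reliable of the two when there are few observations per cluster.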

My concern is that 22 countries have only one respondent, and many of the others have only a few (the frequencies are below). I was thinking of saying: even though the model with country as a random intercept fits better, I will use simple linear regression rather than the mixed model because of sparse data and unstable estimates. Or would it be better to run a GEE model with an independent correlation structure?

proc genmod data=tmp;
  class country;
  model ave_score = var1 var2 .... / dist=normal link=id type3;
  repeated subject=country / type=ind;
run;

country Frequency
country_1 1
country_2 1
country_3 1
country_4 24
country_5 1
country_6 2
country_7 1
country_8 5
country_9 22
country_10 2
country_11 1
country_12 1
country_13 20
country_14 12
country_15 1
country_16 1
country_17 1
country_18 1
country_19 1
country_20 2
country_21 2
country_22 1
country_23 2
country_24 1
country_25 6
country_26 1
country_27 3
country_28 4
country_29 32
country_30 5
country_31 1
country_32 6
country_33 15
country_34 10
country_35 1
country_36 1
country_37 18
country_38 2
country_39 3
country_40 2
country_41 1
country_42 3
country_43 6
country_44 8
country_45 1
country_46 2
country_47 2
country_48 7
country_49 6
country_50 18
country_51 5
country_52 1
country_53 1
country_54 9
country_55 3
total 290
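
A summary of the cluster sizes themselves can make the sparsity easier to see than the per-country listing. The sketch below assumes the analysis data set is tmp with one row per respondent; country_n is a hypothetical name for the intermediate data set.

```sas
/* One row per country with its number of respondents (COUNT) */
proc freq data=tmp noprint;
  tables country / out=country_n(keep=country count);
run;

/* Distribution of cluster sizes: how many countries have 1, 2, 3, ... respondents */
proc freq data=country_n;
  tables count / nocum;
run;
```

For the frequencies listed above, the first row of that second table would show 22 of the 55 countries with a single respondent, which is 22 of the 290 observations (about 7.6%).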

I would appreciate your help in choosing the best approach,

Thanks so much!

Accepted Solution
StatDave
SAS Super FREQ

Both approaches can handle the structure of your data. The random-effects model in MIXED is a subject-specific model that is best for individual predictions. The GEE model is a marginal, or population-averaged, model that is best for population-level inference. But as Allison notes in his book "Fixed Effects Regression Methods for Longitudinal Data Using SAS" (Allison, P., SAS Institute, 2005), the two are effectively equivalent for a linear model like yours, though you might want to use the exchangeable structure (TYPE=EXCH) in the GEE model. Another possible approach is the fixed effects model, which Allison's book also discusses and which can be fit with the ABSORB statement in PROC GLM.

Note that the recommended procedure for fitting the GEE model is now PROC GEE though GENMOD can certainly be used. Also, the GEE model does not use a likelihood-based approach, so model comparisons using the likelihood or measures like AIC are not possible.
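
A minimal sketch of the two alternatives mentioned above, reusing the data set and variable names from the question. It assumes ave_score is the response and shows only var1 and var2, since the other predictors are elided in the original post; treat it as a starting point rather than a finished analysis.

```sas
/* GEE with an exchangeable working correlation, fit with PROC GEE */
proc gee data=tmp;
  class country;
  model ave_score = var1 var2 / dist=normal link=identity;  /* add remaining predictors */
  repeated subject=country / type=exch;
run;

/* Fixed effects model: country intercepts are absorbed rather than estimated. */
/* ABSORB requires the data to be sorted by the absorbed variable.             */
proc sort data=tmp;
  by country;
run;

proc glm data=tmp;
  absorb country;
  model ave_score = var1 var2 / solution;  /* add remaining predictors */
run;
quit;
```

One design point to keep in mind with the fixed effects version: it uses only within-country variation, so countries with a single respondent contribute nothing to the slope estimates.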

bhr-q
Pyrite | Level 9
Thanks so much for your answer; it was helpful. I used an independent correlation structure because, if cluster size is informative, GEE models with an exchangeable correlation structure can produce biased estimates.
https://pubmed.ncbi.nlm.nih.gov/37439089/
https://pmc.ncbi.nlm.nih.gov/articles/PMC9908044/
