Hi,
I have a 4-level data structure that looks like the following:
Level 1: Waves
Level 2: Individuals
Level 3: Family
Level 4: Neighbourhood
Level 1 is nested in level 2, level 2 is nested in level 3, and level 3 is nested in level 4.
I would like to have fixed effects at the neighbourhood level (using neighbourhood IDs) with random effects at the family and individual levels. I tried both GLIMMIX (RANDOM intercept with SUBJECT= options) and GENMOD (REPEATED statement with SUBJECT=), but, apparently because I have many neighbourhood IDs and a large number of observations, I keep running into errors and memory issues.
It was suggested to me that, alternatively, I could have fixed effects at the neighbourhood level together with a SAS option that corrects the standard errors of the fixed-effect parameters for clustering (i.e., at the family and individual levels). But I am not sure how to do this in SAS. In Stata, there is a "vce cluster" option for the simple logistic regression command. I wonder if there is anything similar in SAS?
I also saw suggestions in previous posts/comments to use the EMPIRICAL option in GLIMMIX. I tried it, but I don't think it is really useful in my case (you still need the SUBJECT= specifications, which seemed too complex to handle together with the fixed effects).
Any suggestions on how to model this? Thank you very much!
First, what is your memsize? Please run the following code and send us the log --
proc options option=memsize value;
run;
Second, there are ways to write a more numerically efficient PROC GLIMMIX program. What is your current PROC GLIMMIX program?
Thanks,
Jill
Hi Jill,
The memory size is 40 gig.
This is the GLIMMIX code:
A couple of suggestions and one question about your code --
Suggestion 1: renumber the clustervars so they are truly nested.
For example, if these are your values --
clustervarl2 clustervarl3
1 1
2 1
3 1
4 2
5 2
change the values so they are now:
clustervarl2 clustervarl3
1 1
2 1
3 1
1 2
2 2
Suggestion 2: try the METHOD=QUAD(FASTQUAD) option in the PROC GLIMMIX statement. Also try DDFM=RESIDUAL in the MODEL statement.
Question 1: I wonder why you have the NOITER option in the PARMS statement when no parameter values are specified there.
Good luck with these tries!
Jill
Hi Jill,
Thank you for your suggestions! My data do indeed use the first clustervar numbering. May I ask why the second numbering would make the model run better (that is, why you suggested changing the values)?
I have tried the quad(fastquad) option before, but it didn't seem to help much. I am now trying the ddfm=residual option and will let you know!
To be honest, I am not sure how to use the PARMS statement and the NOITER option, but specifying them seemed to help the model run more efficiently.
Do you know if PHREG can be used for the discrete-time event history analysis I am doing, or can it only be used for continuous time? I am also looking for other procedures that can help with what I want...
Many thanks!
Yu-Chin
Changing the subject values so they are truly nested can reduce the number of levels for clustervar_l2 and therefore reduce the run time. This approach is illustrated in the following usage note --
http://support.sas.com/kb/37057
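As a rough illustration of the renumbering in suggestion 1, here is a minimal DATA step sketch. It is not taken from the usage note. The dataset names have, have_sorted and have_nested and the new variable clustervarl2_new are hypothetical; clustervarl2 and clustervarl3 are the names used in the example table above, so substitute your own.

```sas
/* Sketch only: "have", "have_sorted", "have_nested" and           */
/* "clustervarl2_new" are hypothetical names.                      */

/* Sort so that rows of each level-2 unit sit together inside      */
/* their level-3 cluster.                                          */
proc sort data=have out=have_sorted;
   by clustervarl3 clustervarl2;
run;

data have_nested;
   set have_sorted;
   by clustervarl3 clustervarl2;
   /* Restart the counter at the first row of each level-3 cluster */
   if first.clustervarl3 then clustervarl2_new = 0;
   /* Advance the counter at the first row of each level-2 unit.   */
   /* The sum statement retains the value across rows, so repeated */
   /* rows of the same unit keep the same number.                  */
   if first.clustervarl2 then clustervarl2_new + 1;
run;
```

With the example values above, clustervarl2 = 4 and 5 inside clustervarl3 = 2 would become clustervarl2_new = 1 and 2, so the new variable has at most as many levels as the largest cluster contains.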
Fastquad is only helpful when you have hierarchical random effects, which is what you have here. The following usage note might be helpful --
http://support.sas.com/kb/60666
The NOITER option is often used when you already know the covariance parameter estimates and therefore do not want PROC GLIMMIX to estimate them iteratively. You might want to take out your PARMS statement for now.
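Since your program is not shown here, the following is only a sketch of how these pieces could fit together, under stated assumptions rather than a corrected version of your code. The dataset have, the response event, the covariates x1 and x2 and the fixed-effect variable neighbourhood are hypothetical names, and clustervarl2 is assumed to be already numbered within clustervarl3.

```sas
/* Sketch only: have, event, x1, x2 and neighbourhood are          */
/* hypothetical names.                                             */
proc glimmix data=have method=quad(fastquad);
   class neighbourhood clustervarl3 clustervarl2;
   /* Neighbourhood enters as a fixed effect; DDFM=RESIDUAL avoids */
   /* the more expensive degrees-of-freedom computations.          */
   model event(event='1') = neighbourhood x1 x2
         / dist=binary link=logit ddfm=residual solution;
   /* Nested subjects: level 3 first, then level 2 within level 3  */
   random intercept / subject=clustervarl3;
   random intercept / subject=clustervarl2(clustervarl3);
   /* No PARMS statement, so the covariance parameters are         */
   /* estimated iteratively from default starting values.          */
run;
```

The SUBJECT= effects are written so that the second is nested in the first, which is the hierarchical layout that FASTQUAD is meant for.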
Hope this helps,
Jill