yuchinher
Calcite | Level 5

Hi,

 

I have a 4-level data structure that looks like the following:

 

Level 1: Waves

Level 2: Individuals 

Level 3: Family

Level 4: Neighbourhood

 

Level 1 is nested in level 2, level 2 is nested in level 3, and level 3 is nested in level 4. 

I would like to have fixed effects at the neighbourhood level (using neighbourhood IDs) with random effects at the family and individual levels. I tried both GLIMMIX (RANDOM intercept with SUBJECT=) and GENMOD (REPEATED with SUBJECT=), but with many neighbourhood IDs and a large number of observations I kept running into errors and memory issues.

It was suggested to me that, alternatively, I could keep fixed effects at the neighbourhood level and use a SAS option that corrects the standard errors of the fixed-effect parameters for clustering (i.e., at the family and individual levels). But I am not sure how to do this in SAS. In Stata, there is a "vce(cluster)" option for the simple logistic regression command. Is there anything similar in SAS?

I also saw in previous posts the suggestion to use the EMPIRICAL option in GLIMMIX. I tried it, but I don't think it is useful in my case: you still need the SUBJECT= specifications, which seemed too complex to handle together with the fixed effects.

Any suggestions on how to model this? Thank you very much!

 

 

5 REPLIES
jiltao
SAS Super FREQ

First, what is your memsize? Please run the following code and send us the log --

proc options option=memsize value;
run;

Second, there are ways to write a more numerically efficient PROC GLIMMIX program. What is your current PROC GLIMMIX program? 

Thanks,

Jill

yuchinher
Calcite | Level 5

Hi Jill,

 

The memory size is 40 gig.

This is the GLIMMIX code:

 
proc glimmix data=long method=laplace noclprint;
     class clustervar_l4 clustervar_l3 clustervar_l2;
     model event(event='1') = clustervar_l4 dur dur2 age sex income hhmember break / solution ddfm=bw link=logit dist=binary;
     random intercept / sub=clustervar_l3;
     random intercept / sub=clustervar_l2(clustervar_l3);
     parms / lowerb=1e-4,.,1e-4 noiter;
     covtest / wald;
run;

Thanks for helping!


jiltao
SAS Super FREQ

A couple of suggestions and one question about your code:

Suggestion 1: Renumber the clustervars so they are truly nested.

    For example, if these are your values:

    clustervar_l2   clustervar_l3
         1               1
         2               1
         3               1
         4               2
         5               2

    change the values so they are now:

    clustervar_l2   clustervar_l3
         1               1
         2               1
         3               1
         1               2
         2               2
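
The renumbering in suggestion 1 could be done with a sort and a DATA step along these lines. This is a minimal sketch, assuming the data set is named long and the ID variables are clustervar_l2 and clustervar_l3 as in the posted program; long_sorted, long_renum, and l2_within are hypothetical names chosen for illustration.

```sas
/* Sort so that all rows of one individual sit together within their family. */
proc sort data=long out=long_sorted;
     by clustervar_l3 clustervar_l2;
run;

data long_renum;
     set long_sorted;
     by clustervar_l3 clustervar_l2;
     /* Restart the counter at the first row of each family. */
     if first.clustervar_l3 then l2_within = 0;
     /* Count each individual once, not once per wave.       */
     /* The sum statement retains l2_within across rows.     */
     if first.clustervar_l2 then l2_within + 1;
run;
```

The new variable l2_within would then take the place of clustervar_l2 in the CLASS statement and in the nested SUBJECT= specification.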

 

Suggestion 2: Try the METHOD=QUAD(FASTQUAD) option in the PROC GLIMMIX statement. Also try DDFM=RESIDUAL in the MODEL statement.

 

Question: Why do you have the NOITER option in the PARMS statement when no parameter values are specified there?
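
Applied to the posted program, suggestion 2 might look like the sketch below. It assumes the data set and variable names from the posted code and leaves out the PARMS statement; it has not been run against the actual data.

```sas
/* Sketch only: same model as posted, with the two suggested changes. */
proc glimmix data=long method=quad(fastquad) noclprint;
     class clustervar_l4 clustervar_l3 clustervar_l2;
     model event(event='1') = clustervar_l4 dur dur2 age sex income hhmember break
           / solution ddfm=residual link=logit dist=binary;
     /* Family-level random intercept */
     random intercept / subject=clustervar_l3;
     /* Individual-level random intercept, nested within family */
     random intercept / subject=clustervar_l2(clustervar_l3);
     covtest / wald;
run;
```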

 

Good luck with these tries!

Jill

 

yuchinher
Calcite | Level 5

Hi Jill,

 

Thank you for your suggestions! My IDs do indeed follow the first numbering scheme. May I ask why the second scheme would make the model run better (why you suggested changing the values)?

I have tried the quad(fastquad) option before, but it didn't seem to help much. I am trying ddfm=residual now and will let you know!

 

To be honest, I am not sure how the PARMS statement and the NOITER option should be used, but those specifications seemed to help the model run more efficiently.

Do you know whether PHREG can be used for the discrete-time event history analysis I am doing, or can it only be used for continuous time? I am also looking for other procedures that can do what I want...

 

Many thanks!

Yu-Chin

 

 

jiltao
SAS Super FREQ

Changing the subject values so they are truly nested can reduce the number of levels for clustervar_l2 and therefore reduce the run time. This approach is illustrated in the following usage note:

http://support.sas.com/kb/37057

FASTQUAD is only helpful when you have hierarchical random effects, which is what you have here. The following usage note might be helpful:

http://support.sas.com/kb/60666

The NOITER option is often used when you already know the covariance parameter estimates and therefore do not want PROC GLIMMIX to estimate them iteratively. You might want to take out your PARMS statement for now.

Hope this helps,

Jill

 
