- How to have "Fixed Effects" and "Cluster Robust St...

05-07-2012 04:40 PM

Dear all,

I am running into a big problem trying to include administrative-region fixed effects while also accounting for cluster-robust standard errors.

The dataset I am using for this research was collected using multiple-stage sampling. Clusters (districts, communes, enumeration areas) are sampled at the first stage, and households are then selected within each cluster. This is a multiple-stage stratified sampling design, so the sample is not perfectly random.

Ignoring the design would underestimate my standard errors, so I would like robust standard errors in the model to fix the problem.

The model I run:

proc genmod data=xlucky descending;
    class districtid (param=ref);
    model (Binary Dependent Variable) = (explanatory variables)
          / dist=binary link=logit;
    repeated subject=districtid / type=cs corrw;
run;

This code gives me all the parameter estimates and robust standard errors.

HOWEVER, when I run:

proc genmod data=xlucky descending;
    class districtid (param=ref);
    model (Binary Dependent Variable) = (explanatory variables districtid)
          / dist=binary link=logit;
    repeated subject=districtid / type=cs corrw;
run;

to add the district fixed effects while keeping the robust standard errors, these warnings appear in the log:

WARNING: The negative of the Hessian is not positive definite. The convergence is questionable.

WARNING: The procedure is continuing but the validity of the model fit is questionable.

WARNING: The specified model did not converge.

Any idea how to get this right?

The same problem happens when I run PROC GLIMMIX.

Thank you for your help.

WL

Posted in reply to WLEE

05-08-2012 07:35 AM

The GENMOD error arises, I think, from using GEEs to estimate the within-cluster variability when districtid is being used in two ways: as a fixed effect in the MODEL statement and as the SUBJECT= variable in the REPEATED statement. Could you share the GLIMMIX code that gives the same error? I feel a lot more comfortable commenting on errors in GLIMMIX, as I use it a lot more than GENMOD.

Moving on, and based on some of the info, I may be answering the wrong question here, but have you considered PROC SURVEYLOGISTIC?

Would the following give anything like what you are looking for:

proc surveylogistic data=xlucky;
    class districtid (param=ref);
    model binary_dependent_variable (descending) = explanatory_variables districtid;
    cluster districtid;
    weight <NEED A VARIABLE HERE>;
run;

This would require some sort of weighting variable to reflect the proportions sampled.

This code could be modified to reflect the multiple levels of sampling.
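
As a rough sketch of such a modification, assuming the data carry a stratification variable and a first-stage cluster identifier, the design could be declared with STRATA and CLUSTER statements. The names stratum_id, psu_id, and samp_wt are hypothetical and do not come from the original data set. As far as I understand the documentation, the SURVEY procedures base the Taylor-series variance estimate on the first stage of sampling, so only the primary sampling unit is named in the CLUSTER statement.

```sas
/* Sketch only: stratum_id, psu_id and samp_wt are hypothetical variable names */
proc surveylogistic data=xlucky;
    strata stratum_id;                 /* strata used when the sample was drawn */
    cluster psu_id;                    /* first-stage cluster (primary sampling unit) */
    class districtid (param=ref);
    model binary_dependent_variable (descending) = explanatory_variables districtid;
    weight samp_wt;                    /* sampling weight for each household */
run;
```

Whether districtid belongs in the MODEL statement, the CLUSTER statement, or the STRATA statement depends on how the survey was actually drawn, which the survey documentation should state.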

Good luck with this.

Steve Denham

Posted in reply to SteveDenham

05-08-2012 09:51 AM

Dear Steve Denham,

Thank you for your very helpful reply.

The Glimmix I fit was:

proc glimmix data=xlucky;
    class districtid;
    model binary_dependent_variable (descending) = explanatory_variables districtid
          / solution dist=binary link=logit;
    random intercept / subject=districtid;
    random _residual_;
run;

I am not sure whether I should include both G-side and R-side random effects in my model, but that is what I did anyway. (I am using a survey with multiple-stage sampling; do I have a basis for assuming that both kinds of random effects are present?)

This code gives me the error:

NOTE: Did not converge.

and gives me no parameter estimates.

Maybe what I should do is to follow your suggestion and use proc surveylogistic to run my regression.

I also have a question about SURVEYLOGISTIC:

What would happen if I do not include the WEIGHT statement in the model?

What is the weight? The population of each district divided by the total population?

Thank you again for your valuable insights on this.

WL

Posted in reply to WLEE

05-08-2012 11:12 AM

Working from the bottom up:

It looks like there is more than just WEIGHT to consider. Multistage sampling means looking at the primary sampling rate and total number of primary sampling units. That gets explained fairly well in the documentation. For examples with a continuous response variable, check PROC SURVEYREG documentation. I think examples 90.4 and 90.5 can be converted to SURVEYLOGISTIC as a guide.
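
To make the weight question concrete: a sampling weight is usually the inverse of a unit's overall selection probability, not a population share. If the WEIGHT statement is omitted, the procedure gives every observation a weight of 1. The sketch below assumes the data set contains the two stage-wise selection probabilities under the hypothetical names p_stage1 and p_stage2. The value 0.10 for RATE= is a placeholder for the first-stage sampling rate, not a figure from this survey.

```sas
/* Sketch only: p_stage1, p_stage2 and samp_wt are hypothetical names */
data xlucky_w;
    set xlucky;
    /* p_stage1 = probability the cluster was selected,
       p_stage2 = probability the household was selected within its cluster */
    samp_wt = 1 / (p_stage1 * p_stage2);   /* inverse overall selection probability */
run;

/* RATE= supplies the first-stage sampling rate (placeholder value);
   TOTAL= could instead supply the total number of primary sampling units */
proc surveylogistic data=xlucky_w rate=0.10;
    cluster districtid;
    model binary_dependent_variable (descending) = explanatory_variables;
    weight samp_wt;
run;
```

Many survey data sets already ship with a ready-made weight variable, in which case the DATA step is unnecessary and that variable goes straight into the WEIGHT statement.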

On to GLIMMIX.

"Did not converge" can happen a lot of ways. With no other messages, it may be that you need more iterations or to slightly relax the convergence criteria. See the NLOPTIONS statement for guidance in these areas.

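A minimal sketch of that adjustment, applied to the model posted above: MAXITER= raises the iteration limit and GCONV= loosens the relative gradient convergence criterion. The values 200 and 1e-6 are illustrative assumptions, not recommendations, and loosening the criteria will not help if the model itself is over-specified.

```sas
/* Sketch only: the NLOPTIONS values are illustrative */
proc glimmix data=xlucky;
    class districtid;
    model binary_dependent_variable (descending) = explanatory_variables districtid
          / solution dist=binary link=logit;
    random intercept / subject=districtid;
    random _residual_;
    nloptions maxiter=200 gconv=1e-6;   /* more iterations, looser gradient criterion */
run;
```
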
My opinion is that the R side effects may not be needed. It might be better to accommodate the multiple stage sampling in G side effects. The secondary sampling units would have to be specified as a class variable, but not included in the model statement. Something like:

proc glimmix data=xlucky;
    class districtid secondid;
    model binary_dependent_variable (descending) = explanatory_variables districtid
          / solution dist=binary link=logit;
    random intercept districtid / subject=secondid solution;
run;

But the more I think about this, the more I believe that the SURVEY procs are where you need to be looking.

Steve Denham

Posted in reply to SteveDenham

05-08-2012 03:24 PM

Thank you again for your comments. It looks like PROC SURVEYLOGISTIC is the way to go.

Thank you for the help.

WL