ZEROMODEL in PROC GENMOD vs. PROC LOGISTIC

12-03-2015 01:37 PM

I am currently fitting a zero-inflated negative binomial (ZINB) model in PROC GENMOD. I posted a different question about the same model here; I am keeping this question separate because it concerns a different issue from the one raised in the other thread.

Anyway, say I have the following model in PROC GENMOD:

    PROC GENMOD data=dat_analysis;
      class Treatment(ref="Control") / param=ref;
      model UAVI = Treatment / dist=zinb offset=logTI;
      zeromodel;
    run;

Specifically, you can see that I fit a null (i.e. intercept only) zeromodel to this data. The output for the zero inflation parameter is:

Intercept Estimate: -1.3971

Intercept St. Err: 0.1667

According to the SAS documentation, this parameter is modelled as:

$$h(\omega_i) = \mathbf{z}_i'\boldsymbol{\gamma}$$

where h is the logit link function, $\omega_i$ is the zero-inflation probability, and in this instance the right-hand side of the equation consists solely of the intercept.

However, if I create a binary indicator variable coding zeros versus non-zeros, and run a logistic regression on that outcome with no covariates, I get completely different results. So, using the following code:

    DATA test;
      set dat_analysis;
      if UAVI=0 then zero=1; else zero=0;
    run;

    PROC LOGISTIC data=test;
      model zero(event='1') = ;
    run;

Then, the parameter estimate for the probability of a zero is:

Intercept Estimate: -0.7621

Intercept St. Err: 0.1079

As you can see, these are radically different results. What accounts for this?

On a related note, I've also noticed that the results for the zero inflation parameter change when I add covariates to the MODEL statement in PROC GENMOD, even if the zeromodel is still specified as null, as in the following:

    PROC GENMOD data=dat_analysis;
      class Treatment(ref="Control") / param=ref;
      model UAVI = Treatment T6 / dist=zinb offset=logTI;
      zeromodel;
    run;

Now, the parameter estimates are:

Intercept Estimate: -1.3943

Intercept St. Err: 0.1663

But why would the results change? I've checked that the issue isn't related to missing values (i.e. all of these models are using the same exact pool of individuals). The parameters in the model statement shouldn't impact the fit of the zero model, since it is still a null model fit to the same number of subjects. And I don't understand why the PROC LOGISTIC model gives different results, when the documentation indicates that the method used in GENMOD is equivalent.

12-03-2015 02:42 PM

As a follow-up to the original post, I have found some elucidation in using PROC FMM. If I fit the following two models:

    PROC FMM data=dat_analysis;
      class Treatment;
      model UAVI = Treatment / dist=truncnegbin offset=logTI;
      model + / dist=Constant;
    run;

    PROC FMM data=dat_analysis;
      class Treatment;
      model UAVI = Treatment / dist=negbinomial offset=logTI;
      model + / dist=Constant;
    run;

The first model (which is a negative binomial hurdle model) gives me the exact same estimated zero probability parameter as PROC LOGISTIC. The second model agrees completely with the output of PROC GENMOD's ZINB fit. So the answer lies somewhere in the difference between these two models, but I have not yet been able to figure out the cause of the difference or its ramifications.

12-03-2015 04:02 PM

The ZINB model in GENMOD is not a hurdle model, but a more general mixture model. The count distribution defined in the MODEL statement *includes* zero as a possible value; on top of that, the ZEROMODEL statement adds a separate zero component (a distribution defined only at 0). The probability of a 0 therefore combines contributions from both the MODEL statement and the ZEROMODEL statement. If you fit just a null logistic model to the zero indicator, you would not get the same results.
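Written out, the mixture described above looks roughly like the following. This is a sketch in the usual ZINB notation, which is an assumption on my part rather than a quote from the documentation: $\omega_i$ is the zero-inflation probability from the ZEROMODEL statement, $\mu_i$ is the negative binomial mean from the MODEL statement (including the offset), and $k$ is the dispersion parameter.

```latex
\begin{aligned}
P(Y_i = 0) &= \omega_i + (1-\omega_i)\,(1 + k\mu_i)^{-1/k},\\
P(Y_i = y) &= (1-\omega_i)\,
  \frac{\Gamma(y + 1/k)}{\Gamma(y+1)\,\Gamma(1/k)}\,
  \frac{(k\mu_i)^{y}}{(1 + k\mu_i)^{\,y + 1/k}},
  \qquad y = 1, 2, \ldots
\end{aligned}
```

The first line is the key point: an observed zero can come from either component, so $\omega_i$ is smaller than the overall probability of a zero whenever the negative binomial part puts any mass at 0. It also shows why $\omega_i$ is estimated jointly with $\mu_i$ and $k$, so changing the MODEL statement can shift the ZEROMODEL intercept slightly.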

In your first FMM run, by using a truncated NB, you exclude 0 from the first component, so that component contributes no probability at 0. Thus, it is a different model.
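To put numbers on this, the two intercepts quoted in the original post can be converted from the logit scale to probabilities. The sketch below is only back-of-the-envelope arithmetic: the variable names are mine, and the "implied" negative binomial share treats the count-model zero probability as a single average value, which is an approximation because the mean varies with Treatment and the offset.

```python
import math

def inv_logit(x):
    """Inverse of the logit link: maps a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Intercepts reported in the original post (logit scale).
logistic_intercept = -0.7621   # PROC LOGISTIC on the zero indicator
zinb_intercept = -1.3971       # ZEROMODEL intercept from PROC GENMOD

# Overall probability of observing a zero (what the null logistic model estimates).
p_zero_overall = inv_logit(logistic_intercept)

# Zero-inflation probability (mixing weight of the point mass at 0 in the ZINB fit).
omega = inv_logit(zinb_intercept)

# Rough average zero probability the negative binomial component would need
# so that omega + (1 - omega) * nb_zero_share matches the overall zero rate.
# Assumption: one common value for all subjects, for illustration only.
nb_zero_share = (p_zero_overall - omega) / (1.0 - omega)

print(f"overall P(zero), logistic:   {p_zero_overall:.4f}")
print(f"zero-inflation weight, ZINB: {omega:.4f}")
print(f"implied NB P(zero), approx.: {nb_zero_share:.4f}")
```

So roughly 32% of observations are zeros, but the ZINB fit attributes only about 20% of observations to the extra zero component; the rest of the zeros are absorbed by the negative binomial part. In the hurdle model the truncated component cannot produce zeros, so its zero parameter has to account for all of them, which is why it matches PROC LOGISTIC.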

12-07-2015 11:17 AM

Isn't it the other way around, though? The results from my truncated negative binomial model AGREE with the results of my null model, whereas the results of the zero-inflated model DISAGREE with the results of the null model.

The SAS documentation explicitly claims that the zero probabilities in a zero-inflated model are calculated using a logistic regression, but fitting that regression directly produces incompatible results. Those results only correspond to the zero probabilities calculated by the hurdle model. So there appears to be a discrepancy between the way the documentation claims the model works and how it works in practice.

12-07-2015 12:10 PM - edited 12-07-2015 12:17 PM

I guess I disagree. I think your results make perfect sense. The results of fitting the mixture model will certainly disagree with the fit from the null model (because both portions of the mixture are giving part of the zero prediction).
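A small simulation can illustrate the point. The sketch below is a hypothetical example, not the poster's data: it draws from a zero-inflated negative binomial with a known inflation probability (`omega`), mean (`mu`), and dispersion (`k`), all values chosen arbitrarily, then compares the observed share of zeros with `omega`. It uses only the Python standard library, with the negative binomial generated as a gamma-Poisson mixture.

```python
import math
import random

def rpoisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's method (fine for small lam)."""
    limit = math.exp(-lam)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count

def simulate_zinb(n, omega, mu, k, seed=1):
    """Simulate n draws from a ZINB: a point mass at 0 with probability omega,
    otherwise a negative binomial with mean mu and dispersion k."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        if rng.random() < omega:
            draws.append(0)  # structural zero from the inflation component
        else:
            # Gamma(shape=1/k, scale=k*mu) has mean mu; mixing a Poisson over it
            # gives a negative binomial with variance mu + k*mu**2.
            lam = rng.gammavariate(1.0 / k, k * mu)
            draws.append(rpoisson(lam, rng))  # may also be 0 (sampling zero)
    return draws

def logit(p):
    return math.log(p / (1.0 - p))

# Illustrative parameter values (assumptions, not estimates from the thread).
omega, mu, k = 0.2, 2.0, 1.0
ys = simulate_zinb(100_000, omega, mu, k, seed=2015)

observed_zero_share = sum(y == 0 for y in ys) / len(ys)
theoretical_zero_share = omega + (1.0 - omega) * (1.0 + k * mu) ** (-1.0 / k)

print(f"true omega:              {omega:.4f}  (logit {logit(omega):.4f})")
print(f"observed share of zeros: {observed_zero_share:.4f}  "
      f"(logit {logit(observed_zero_share):.4f})")
print(f"theoretical P(Y = 0):    {theoretical_zero_share:.4f}")
```

A null logistic regression on the zero indicator estimates the logit of the observed share of zeros, which here is well above the logit of `omega`, because the negative binomial component generates zeros of its own.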

12-07-2015 12:30 PM - edited 12-07-2015 12:34 PM

Then the SAS documentation is incorrect? This is where my confusion lies; the way it is described in SAS implies that it SHOULD agree with the null model when it clearly does NOT. That is why I am asking for clarification on the issue. The SAS documentation implies that using the "zeromodel" statement with no specified effects is equivalent to fitting the null model, which clearly isn't the case.