Dear SAS Community,
I am running a PROC LOGISTIC model to compare the rotting percentage between avocado varieties. Since the outcome variable can take only two values (0 or 100%), I am analyzing it as a binomial variable. I am using the FIRTH option because otherwise I get this warning: There is possibly a quasi-complete separation of data points. The maximum likelihood estimate may not exist.
title2 'PercStemEndRot: Comparing varieties within Weeks for each Harvest across Season';
proc logistic data=one desc;
class Harvest Variety Wks/param=glm;
model PercStemEndRot=Harvest*Variety*Wks/firth;
slice Harvest*Variety*Wks/sliceby=Harvest*Wks adjust=simulate(seed=1);
run;
When using the FIRTH option, I got this warning: Ridging has failed to improve the loglikelihood. You may want to use a different ridging technique (RIDGING= option), or switch to using linesearch to reduce the step size (RIDGING=NONE), or specify a new set of initial estimates (INEST= option).
Is there anything I could do to get around this issue other than eliminating variables or interactions from the model?
Thank you very much!
Accepted Solutions
First, your model specification is not producing the separate main effects and 2-way interactions. It only contains the 3-way interaction. However, all the degrees of freedom of the main effects and 2-way interactions are included in the 3-way interaction using that specification. So, it is effectively equivalent to writing
model PercStemEndRot=Harvest|Variety|Wks
which is the shorthand way to specify all of the main effects and interactions.
But the bottom line is that you'll need to simplify the model because the model complexity using all the effects (either explicitly or implicitly specified as noted above) is making the data too sparse. I suggest that you start with only the main effects model to see if it is successful:
model PercStemEndRot=Harvest Variety Wks
and then add interactions one at a time as long as the fit succeeds - initially without FIRTH and then adding it if needed.
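The step-by-step approach described above could look roughly like the sketch below. It reuses the data set and variable names from the original post; the choice of Variety*Wks as the first interaction to try is only an assumption for illustration, and the order in which interactions are added is up to the analyst.

```sas
/* Step 1: main effects only, without FIRTH */
proc logistic data=one desc;
   class Harvest Variety Wks / param=glm;
   model PercStemEndRot = Harvest Variety Wks;
run;

/* Step 2: if step 1 fits, add a single two-way interaction.    */
/* Variety*Wks is an assumed first choice, not a recommendation */
/* taken from the thread.                                       */
proc logistic data=one desc;
   class Harvest Variety Wks / param=glm;
   model PercStemEndRot = Harvest Variety Wks Variety*Wks;
run;

/* Step 3: only if the separation warning reappears, refit the */
/* same model with the FIRTH option.                           */
proc logistic data=one desc;
   class Harvest Variety Wks / param=glm;
   model PercStemEndRot = Harvest Variety Wks Variety*Wks / firth;
run;
```

Checking the log for the separation or ridging warnings after each step shows which added effect makes the data too sparse.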
What does running a proc freq of this sort tell you:
proc freq data=one; /* "one" is the data set name used in the original post */
tables harvest*wks*variety*PercStemEndRot/cmh;
run;
The CMH option should give you a test for association of variety with PercStemEndRot, after adjusting for harvest and wks. In addition, it should let you know where the zeroes are in your data. Consolidating categories is probably the best way to handle this.
SteveDenham
(I can't believe I am not offering some sort of exact approach to a generalized linear model, but I think this has two advantages - you will know where the zeroes are, and I believe you will still get some useful inferential information).
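To find where the zeroes are without scanning every printed table, one possible sketch is to write the cell counts to a data set and print only the empty cells. The data set name cellcounts is hypothetical; the SPARSE option asks PROC FREQ to include zero-count combinations in the OUT= data set, where the frequency is stored in the variable COUNT.

```sas
/* Write all Harvest x Wks x Variety x response combinations, */
/* including empty ones, to a data set (name is hypothetical). */
proc freq data=one noprint;
   tables harvest*wks*variety*PercStemEndRot / sparse out=cellcounts;
run;

/* List only the combinations that never occur in the data */
proc print data=cellcounts;
   where count = 0;
run;
```

The rows printed here are the candidates for consolidating categories.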
Thank you very much StatDave, I will do that.
Thank you Steve! So if the general association is significant, it means that there is an effect of Variety on PercStemEndRot?
Cochran-Mantel-Haenszel Statistics (Based on Table Scores)
Statistic | Alternative Hypothesis | DF | Value | Prob |
---|---|---|---|---|
1 | Nonzero Correlation | 1 | 27.1483 | <.0001 |
2 | Row Mean Scores Differ | 10 | 253.3270 | <.0001 |
3 | General Association | 10 | 253.3270 | <.0001 |
Thanks StatDave