I need some help with fitting a PO (proportional odds) model for the first time...
I'm evaluating the effects of different factors on the cleanliness of cows.
My response is a scoring of cleanliness on a scale with four levels: Clean=1 -
Covered in dirt=4. I have data from 1362 cows clustered in 44 herds.
I started out with 16 explanatory variables - and after backward elimination
I'm left with 3-4 categorical and one quantitative explanatory variable.
I've checked the PO assumption using PROC LOGISTIC - and the assumption did
not hold (p<0.001). If I excluded one of my variables, it came closer to holding (p=0.482).
I then assessed the ordinality of my response with respect to each
explanatory variable by plotting the mean of each explanatory variable
stratified by levels of the response variable. As the means did not differ
between scores 1 and 2, this suggested that these two levels should be merged.
I merged them and ran PROC LOGISTIC again (with the 'suspicious'
variable included again). Now the proportional odds assumption came closer to
holding (p=0.0302).
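For anyone following along, the merge-and-retest step described above might look roughly like this. It is only a sketch: the data set name (cows) and the variable names (score, housing, bedding, parity, herd_size) are made up for illustration and are not my real ones. With an ordinal response of more than two levels, PROC LOGISTIC fits the cumulative logit model by default and prints the "Score Test for the Proportional Odds Assumption". Note that this ignores the clustering in herds.

```sas
/* Sketch only: data set and variable names are hypothetical. */
data cows2;
  set cows;
  /* Merge cleanliness scores 1 and 2 into a single level */
  if score in (1, 2) then score3 = 1;
  else score3 = score - 1;   /* 3 -> 2, 4 -> 3 */
run;

proc logistic data=cows2;
  /* Three categorical predictors and one quantitative predictor */
  class housing bedding parity / param=ref;
  /* Cumulative logit (PO) model; the score test of the PO
     assumption is printed as part of the default output */
  model score3 = housing bedding parity herd_size;
run;
```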
My questions, for now, are:
* Can I continue with the PO model?
* Does the test of the PO assumption in PROC LOGISTIC hold when I can't take
into account the clustering of my data?
* And how do I assess model fit using PROC GENMOD (SAS 9.1)?
This became a very long story - I hope that some of you can find the time to
help me out!
I would note that the example provided in the above link does not take any cluster effects into account. However, the NLMIXED procedure will allow estimation of herd random effects. Look at the documentation of the NLMIXED procedure to learn more about how to incorporate random effects in your analysis.
After you have read the note about fitting a PPO model and the documentation of the NLMIXED procedure, you might want to post again to this forum with some revised problem statement.
Hi Dale - and any other smart people out there ;o),
Thanks for your answer!
I actually tried that, using GENMOD... I have been working my way through Stokes (Categorical Data Analysis, p. 533), but I am having problems because my sample size is not adequate. In a two-way cross-classification of my explanatory variables, the count for one cell is only two. And then the model crashes...
Is that different in PROC NLMIXED?
Besides, according to Harrell (Regression Modeling Strategies, chap 13, p. 335) the PO test is extremely anti-conservative. The p-value of the test was ~0.03 - and I wonder whether this could be acceptable anyway? PROC LOGISTIC does not allow me to take the clustering of my data into account - how does that affect the PO test?
I have not seriously investigated how the PPO model is fit using the GENMOD procedure. However, I do know that you have to employ an altered construction of the data and that PPO model estimation employing the GENMOD procedure then requires a GEE model to attempt to account for the covariance of response levels which arises from a multinomial distribution. But the GEE only approximates the covariance of the response levels. Moreover, the GEE which is employed to account for the covariance arising from a multinomial response requires specification of cow as the subject. But you wanted to use a GEE to account for correlations between cows due to clustering in herds. Thus, the PPO model fit employing the GENMOD procedure does not allow appropriate modeling of all of the correlations which appear in the expanded data.
The NLMIXED procedure does not require an altered data construction AND it does fit the multinomial model. Covariances between response levels are accounted for in the likelihood maximization process. Moreover, you can incorporate random herd effects to account for within-herd correlated responses. When you fit the PPO (and PO) models employing NLMIXED, you test the proportional odds assumption employing a likelihood ratio test rather than a score test. I have not studied score and likelihood ratio tests for testing the proportional odds assumption. However, my guess is that a likelihood ratio test would have better properties than the score test.
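To make the idea concrete, here is a minimal sketch of how the PO and PPO fits and the likelihood ratio test might be set up. Everything specific in it is an assumption for illustration: the data set cows, a four-level response score coded 1-4, a single numeric covariate x, the cluster variable herd, and all starting values. It has not been run against your data, and with sparse cells the PPO fit may well fail to converge.

```sas
/* Sketch only: data set, variable names and starting values are hypothetical. */
proc sort data=cows; by herd; run;   /* RANDOM needs clusters grouped together */

/* PO model: one common slope b, random herd intercept u */
proc nlmixed data=cows qpoints=7;
  parms a1=-1 a2=0 a3=1 b=0 s2u=0.5;
  bounds s2u >= 0;
  eta = b*x + u;
  c1 = 1 / (1 + exp(-(a1 - eta)));   /* P(score <= 1) */
  c2 = 1 / (1 + exp(-(a2 - eta)));   /* P(score <= 2) */
  c3 = 1 / (1 + exp(-(a3 - eta)));   /* P(score <= 3) */
  if score = 1 then p = c1;
  else if score = 2 then p = c2 - c1;
  else if score = 3 then p = c3 - c2;
  else if score = 4 then p = 1 - c3;
  if p > 1e-8 then ll = log(p); else ll = -1e100;   /* guard against log(0) */
  model score ~ general(ll);
  random u ~ normal(0, s2u) subject=herd;
  ods output FitStatistics=fit_po;
run;

/* PPO model: a separate slope for each cumulative logit */
proc nlmixed data=cows qpoints=7;
  parms a1=-1 a2=0 a3=1 b1=0 b2=0 b3=0 s2u=0.5;
  bounds s2u >= 0;
  c1 = 1 / (1 + exp(-(a1 - b1*x - u)));
  c2 = 1 / (1 + exp(-(a2 - b2*x - u)));
  c3 = 1 / (1 + exp(-(a3 - b3*x - u)));
  if score = 1 then p = c1;
  else if score = 2 then p = c2 - c1;
  else if score = 3 then p = c3 - c2;
  else if score = 4 then p = 1 - c3;
  if p > 1e-8 then ll = log(p); else ll = -1e100;
  model score ~ general(ll);
  random u ~ normal(0, s2u) subject=herd;
  ods output FitStatistics=fit_ppo;
run;

/* Likelihood ratio test of the PO assumption.
   df = extra slopes in the PPO model (3 - 1 = 2 for one covariate). */
data lrt;
  merge fit_po (where=(Descr = '-2 Log Likelihood') rename=(Value=m2ll_po))
        fit_ppo(where=(Descr = '-2 Log Likelihood') rename=(Value=m2ll_ppo));
  lr = m2ll_po - m2ll_ppo;
  df = 2;
  p_value = 1 - probchi(lr, df);
run;

proc print data=lrt; var m2ll_po m2ll_ppo lr df p_value; run;
```

Categorical predictors would have to be entered as dummy variables that you code yourself, since NLMIXED has no CLASS statement. Each additional predictor adds one slope to the PO model and three to the PPO model, so the df in the test has to be adjusted accordingly. It is also worth printing fit_po and fit_ppo to confirm that the Descr label matches the string used in the WHERE= option.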
I can't guarantee that you won't have problems fitting the PPO model employing the NLMIXED procedure. However, the NLMIXED procedure has much going for it over the GENMOD procedure for estimating a PPO model.