09-05-2014 05:33 AM
I am currently analysing data on hens that have been checked for wounds on their bodies. For each hen, 4 locations were scored (Location_) at the beginning and the end of an experiment (Date). The data (score) are binary: 1 = wound present, 0 = no wound present. What I want to know is whether the hens' health has improved over time (date) and whether this is related to the number of visits they paid to an outdoor enclosure.
My data looks like this:
Currently the model I am working with looks like this:
proc glimmix data=veer method=laplace;
  class date id location_ score;
  model score(event='1') = visits Location_|date / dist=binary link=logit solution;
  random location_ / subject=id type=vc;
  random date / subject=id type=ar(1);
  slice Location_*date / sliceby=Location_ diff;
run;
The output relevant to my research questions consists of estimates for the independent variables and the following tables, which, as far as I understand, compare the score at the two time points for each location separately. In this case, scores for the location head are not significantly different.
My questions are:
- Have I treated the doubly repeated measures (different locations on the same hen at different times) correctly?
- In doing so, have I specified the covariance matrix types in the random statements correctly? (I doubt whether type=vc is correct for the location statement, and wonder whether I should use type=chol instead.)
- And do I interpret the output generated by the slice statement correctly (as described above)?
Thank you in advance,
09-05-2014 09:09 AM
Looks like a good approach. I do wonder about selecting VC as the covariance structure for the locations. If you have lots of data, consider type=UN (or type=CHOL), to model the covariance between locations on the birds. VC assumes that there is no correlation between locations, and you may have to make that assumption to get convergence. With 4 locations, type=CHOL would estimate 4 variances and 6 covariances, so to get good estimates of 10 parameters, the rule of thumb of 10 observations per parameter would mean 100 obs, and with a binomial distribution that usually means 100 observations in the category of interest--per time point. Given all that, I would say you have modeled this well, given the amount of data in hand.
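As a minimal sketch of that change, assuming the same dataset and variable names as in the original post, only the type= option on the first RANDOM statement differs. Whether it converges will depend on how much data you have:

```sas
/* Sketch only: same model as posted, but with an unstructured
   (Cholesky-parameterized) covariance among the 4 locations within a hen.
   This estimates 4 variances + 6 covariances = 10 parameters. */
proc glimmix data=veer method=laplace;
  class date id location_ score;
  model score(event='1') = visits location_|date / dist=binary link=logit solution;
  random location_ / subject=id type=chol;   /* was type=vc */
  random date / subject=id type=ar(1);
run;
```

If this fails to converge or gives implausible estimates, falling back to type=vc is the pragmatic choice.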
09-08-2014 07:29 AM
Thank you for your response. I have used your suggestions and settled on type=chol as a suitable type, since type=UN set some of my DFs to infinity. My only concern is that with type=chol the Type 3 test indicates that the interaction 'Location_*date' is not significant. I cannot remember having read this anywhere, but I imagine it is not appropriate to use the slice statement to test for differences over time per location when the interaction is not significant.
Hope you could comment on this.
09-08-2014 08:43 AM
If the interaction is not significant, there is no strong evidence to reject the null that all of the means specified by the interaction are equal, and thus, slicing to look for differences could be considered data dredging.
If you have specific, pre-specified comparisons that you wish to make, and only those comparisons, you might consider setting up LSMESTIMATE statements with some sort of adjustment for multiple comparisons. This would be my choice in a confirmatory experiment. This looks more exploratory, and thus, I would look at that type 3 test as a strong indicator that comparisons of lsmeans should be avoided.
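A rough sketch of what such a statement might look like, for the begin-versus-end comparison within each location. The coefficient order is an assumption: it presumes location_ has 4 levels and date has 2, and that date varies fastest within location_ (I put location_ before date in the CLASS statement for that reason). The labels are placeholders, since I don't know your level names. The E option prints the coefficients so you can check that each row picks out the means you intend:

```sas
/* Sketch only: pre-specified within-location comparisons over time,
   with a Bonferroni adjustment across the 4 rows.
   ASSUMPTION: 8 lsmeans ordered as (loc1,date1) (loc1,date2) (loc2,date1) ...
   Verify against the output of the E option before interpreting. */
proc glimmix data=veer method=laplace;
  class id location_ date score;
  model score(event='1') = visits location_|date / dist=binary link=logit solution;
  random location_ / subject=id type=vc;
  random date / subject=id type=ar(1);
  lsmestimate location_*date
    'location 1: date 1 vs date 2' 1 -1 0  0 0  0 0  0,
    'location 2: date 1 vs date 2' 0  0 1 -1 0  0 0  0,
    'location 3: date 1 vs date 2' 0  0 0  0 1 -1 0  0,
    'location 4: date 1 vs date 2' 0  0 0  0 0  0 1 -1
    / adjust=bon exp cl e;
run;
```

The estimates are differences on the logit scale, so the EXP option reports them as odds ratios.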
09-08-2014 10:37 AM
Okay, I suspected that. Then I have one last question regarding type=vc and type=chol (both give viable output with my dataset). On closer inspection, it seems my dataset contains fewer observations than are needed for type=chol. When you say 'VC assumes that there is no correlation between locations', do you mean that the model assumes the health scores of the different locations are not related, and thus not grouped by the same animal? Or does it mean that they are grouped on the same animal, but changes in one location do not depend on changes in another location?
09-08-2014 10:47 AM
The latter. VC assumes that the variability in one location is independent of the variability in another. For instance, the appearance of lesions on the head is completely independent of lesions on the back. My natural inclination is that animals that are wound-susceptible will have similar scores no matter where on the animal. If the relationship is anatomical, then some unstructured relationship is probably best. A heterogeneous compound symmetry structure might be appropriate (type=CSH). Take a look at the documentation for the type= option, especially the matrix representations, and see if that might be an appropriate structure.
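For orientation, here is how I read the two structures for the 4 location effects within one hen. This is a sketch; the symbols are my own notation, so check it against the documentation. With type=vc on a single random effect there is one variance and no covariances, while type=csh gives each location its own variance and ties all pairs together through one common correlation:

```latex
\[
\mathbf{G}_{\mathrm{VC}} = \sigma^2 \mathbf{I}_4,
\qquad
\mathbf{G}_{\mathrm{CSH}} =
\begin{pmatrix}
\sigma_1^2 & \rho\,\sigma_1\sigma_2 & \rho\,\sigma_1\sigma_3 & \rho\,\sigma_1\sigma_4 \\
\rho\,\sigma_2\sigma_1 & \sigma_2^2 & \rho\,\sigma_2\sigma_3 & \rho\,\sigma_2\sigma_4 \\
\rho\,\sigma_3\sigma_1 & \rho\,\sigma_3\sigma_2 & \sigma_3^2 & \rho\,\sigma_3\sigma_4 \\
\rho\,\sigma_4\sigma_1 & \rho\,\sigma_4\sigma_2 & \rho\,\sigma_4\sigma_3 & \sigma_4^2
\end{pmatrix}
\]
```

That makes 1 parameter for VC and 5 for CSH (4 variances plus 1 correlation), compared with 10 for type=UN or type=CHOL, so CSH is a middle ground when the data cannot support a fully unstructured matrix.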