mgerritsen
Calcite | Level 5

Hi,

I am currently analysing data on hens that have been checked for wounds on their body. For each hen, 4 locations (Location_) were scored at the beginning and at the end of an experiment (Date). The data (Score) are binary: 1 = wound present, 0 = no wound present. What I want to know is whether the hens' health has improved over time (Date) and whether this is related to the number of visits they paid to an outdoor enclosure.

My data looks like this:

Date  Id  Location  visits  Score
1     1   Head      0       0
1     1   Back      0       0
1     1   Belly     0       1
1     1   Crest     0       1
2     1   head      38      0
2     1   back      38      1
2     1   belly     38      1
2     1   crest     38      0
1     2   head      0       1
1     2   back      0       1
1     2   belly     0       1
1     2   crest     0       0
2     2   head      17      1
2     2   back      17      0
2     2   belly     17      1
2     2   crest     17      1
1

Currently, the model I am working with looks like this:

proc glimmix data=veer method=laplace;
  class date id location_ score;
  model score(event='1') = visits Location_|date / dist=binary link=logit solution;
  random location_ / subject=id type=vc;
  random date / subject=id type=ar(1);
  slice Location_*date / sliceby=Location_ diff;
run;

The output relevant to my research questions consists of the estimates for the independent variables and the following tables, which, as far as I understand, compare the score at the two dates for each location separately. In this case, the scores for the location head are not significantly different.

Untitled.png

My questions are:

- Have I treated the doubly repeated measures (different locations on the same hen at different times) correctly?

- In doing so, have I specified the covariance matrix types in the random statements correctly? (I doubt whether type=vc is correct for the location statement, and wonder whether I should use type=chol instead.)

- Am I interpreting the output generated by the slice statement correctly (as described above)?

Thank you in advance,

Marielle Gerritsen

5 REPLIES
SteveDenham
Jade | Level 19

Hi Marielle,

Looks like a good approach. I do wonder about selecting VC as the covariance structure for the locations. If you have lots of data, consider type=UN (or type=CHOL), to model the covariance between locations on the birds. VC assumes that there is no correlation between locations, and you may have to make that assumption to get convergence. With 4 locations, type=CHOL would estimate 4 variances and 6 covariances, so to get good estimates of 10 parameters, the rule of thumb of 10 observations per parameter would mean 100 observations, and with a binomial distribution that usually means 100 observations in the category of interest, per time point. Given all that, I would say you have modeled this well, given the amount of data in hand.

Steve Denham
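
As a rough sketch of the suggestion above, the location covariance could be switched from VC to an unstructured matrix in Cholesky form. The dataset name (veer) and variable names are taken from the original post, and the rest of the model is left as posted. With 4 locations this fits 4(4+1)/2 = 10 covariance parameters for the location effect, so it is only worth trying if the data can support that many.

```sas
/* Sketch only: the model from the question with the location
   covariance changed from VC to unstructured (Cholesky form).
   Dataset and variable names are those used in the original post. */
proc glimmix data=veer method=laplace;
  class date id location_ score;
  model score(event='1') = visits location_|date
        / dist=binary link=logit solution;
  /* 4 variances + 6 covariances among the 4 locations within a hen */
  random location_ / subject=id type=chol;
  random date / subject=id type=ar(1);
run;
```

If this version fails to converge or produces implausible covariance estimates, that is a sign the data do not support 10 parameters, and falling back to type=vc is a reasonable choice.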

mgerritsen
Calcite | Level 5

Hi Steve,

Thank you for your response. I have used your suggestions and settled on type=chol as a suitable type, as type=UN set some of my DFs to infinity. My only concern with type=chol is that the type 3 test now indicates that the interaction 'Location_*date' is not significant. I cannot remember having read this anywhere, but I can imagine that it is not appropriate to use the slice statement to test for differences over time per location when the interaction is not significant.

Hope you could comment on this.

Marielle

SteveDenham
Jade | Level 19

If the interaction is not significant, there is no strong evidence to reject the null that all of the means specified by the interaction are equal, and thus, slicing to look for differences could be considered data dredging.

If you have specific, pre-specified comparisons that you wish to make, and only those comparisons, you might consider setting up LSMESTIMATE statements with some sort of adjustment for multiple comparisons.  This would be my choice in a confirmatory experiment.  This looks more exploratory, and thus, I would look at that type 3 test as a strong indicator that comparisons of lsmeans should be avoided.

Steve Denham
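
For the confirmatory case described above, a minimal sketch of pre-specified comparisons with LSMESTIMATE might look like the following. It assumes that the four date 1 versus date 2 contrasts (one per location) were planned in advance, that the location labels have been cleaned to consistent lowercase spellings (back, belly, crest, head), and that location_ is listed before date in the CLASS statement so that date varies fastest within location in the interaction. These are assumptions, not part of the original post; the E option prints the coefficients so the ordering can be checked against the actual class levels.

```sas
/* Sketch only: pre-specified date 1 vs date 2 comparisons per location,
   with a Bonferroni adjustment across the four comparisons.
   ASSUMED LS-means order for location_*date:
   back*1 back*2 belly*1 belly*2 crest*1 crest*2 head*1 head*2 */
proc glimmix data=veer method=laplace;
  class id location_ date score;   /* location_ listed before date (assumption) */
  model score(event='1') = visits location_|date
        / dist=binary link=logit solution;
  random location_ / subject=id type=vc;
  random date / subject=id type=ar(1);
  lsmestimate location_*date
    'back:  date 1 vs date 2'  1 -1  0  0  0  0  0  0,
    'belly: date 1 vs date 2'  0  0  1 -1  0  0  0  0,
    'crest: date 1 vs date 2'  0  0  0  0  1 -1  0  0,
    'head:  date 1 vs date 2'  0  0  0  0  0  0  1 -1
    / adjust=bon cl e;             /* e: display coefficients for checking */
run;
```

The estimates are on the logit scale, so each one is a log odds ratio for a wound being present at date 1 relative to date 2 at that location.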

mgerritsen
Calcite | Level 5


Okay, I suspected that. Then I have one last question regarding type=vc and type=chol (both give viable output using my dataset). After closer inspection, it seems my dataset contains fewer observations than are needed for type=chol. So when you say that 'VC assumes that there is no correlation between locations', do you mean that the model assumes that the health scores of the different locations are not related, and thus not grouped by the same animal? Or does it mean that they are grouped on the same animal, but changes in one location do not depend on changes in another location?

SteveDenham
Jade | Level 19

The latter. VC assumes that the variability in one location is independent of the variability in another. For instance, the appearance of lesions on the head is completely independent of lesions on the back. My natural inclination is that animals that are wound susceptible will have similar scores no matter where on the animal. If the relationship is anatomical, then some unstructured relationship is probably best. A heterogeneous compound symmetry structure might be appropriate (type=CSH). Take a look at the documentation for the type= option, especially the matrix representations, and see if that might be an appropriate structure.

Steve Denham
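
A hedged sketch of the CSH alternative mentioned above: only the TYPE= option on the location random statement changes relative to the posted model. For 4 locations, CSH estimates 4 variances plus a single common correlation (5 parameters), which sits between VC and the 10 parameters of CHOL. Dataset and variable names are those from the original post.

```sas
/* Sketch only: heterogeneous compound symmetry for the locations.
   Each location gets its own variance; all pairs of locations on
   the same hen share one correlation parameter. */
proc glimmix data=veer method=laplace;
  class date id location_ score;
  model score(event='1') = visits location_|date
        / dist=binary link=logit solution;
  random location_ / subject=id type=csh;
  random date / subject=id type=ar(1);
run;
```

Because the fixed effects are the same in each version, one possible way to choose among VC, CSH and CHOL is to compare the information criteria in the Fit Statistics table across the three fits.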
