sb16031
Fluorite | Level 6

Hi Folks, 

I apologize in advance if this is a newbie question. I'm still relatively unfamiliar with SAS, and overall I don't have a super strong statistics background. I'm trying to analyze an animal feeding trial that someone else executed. Here is the study description: 

  • There are a number of birds, housed in 14 blocks.
  • Each block contains a number of pens, for a total of 120 pens.
  • Each pen contains 6-8 birds (the number is variable, as some may die during the trial).
  • The variables that were measured at each time point were BW (body weight) and FI (cumulative feed intake); these are calculated pen-wise: for BW, all birds in the pen are weighed and the total weight is divided by the number of birds; the same is done for FI. 
  • The trial length was 43 days, and measurements were taken on days 0, 14, 21, 28, and 43.
  • There were 8 treatments:
    • T1 is a control (no disease, no supplements);
    • T2 the birds were challenged with a disease;
    • T3 the birds were challenged and also received a commercial antibiotic;
    • T4 the birds were challenged and received no antibiotic, but instead a supplement at a rate of 0.125 g/ton;
    • T5 is the same as T4 but the rate was 0.250 g/ton;
    • T6 as above, but rate was 0.5 g/ton;
    • T7 same as above but the rate was 0.75 g/ton;
    • T8 same as above but rate was 1 g/ton. 
  • All of the treatments have the same number of PENs 

I have set up the dataset to be the following way: 

PEN  BLOCK  TRT  DAY  AB  CHALLENGE  SUPPL  VAR  VALUE
1    1      T4   0    NO  YES        0.125  BW   43
1    1      T4   14   NO  YES        0.125  BW   33
...  ...    ...  ...  ... ...        ...    ...  ...
1    1      T4   14   NO  YES        0.125  FI   525

 

Where DAY is the number of days elapsed, AB is whether they're receiving antibiotic or not (only T3 received it), CHALLENGE is whether they were challenged or not (all except T1 were challenged), and SUPPL is the amount of supplement received (0 for T1-T3, increasingly higher for T4-T8). 
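
The coding described above can be derived from TRT in a single DATA step. This is only a sketch: CHICKS_RAW is a hypothetical name for a dataset that has TRT but not yet AB, CHALLENGE, or SUPPL, and TRT is assumed to be a character variable with values 'T1' through 'T8'.

```sas
/* Sketch: derive AB, CHALLENGE and SUPPL from TRT.
   CHICKS_RAW is a hypothetical input dataset name. */
data chicks;
    set chicks_raw;
    length ab challenge $3;

    /* Only T3 received the commercial antibiotic */
    if trt = 'T3' then ab = 'YES';
    else ab = 'NO';

    /* Every treatment except the T1 control was challenged */
    if trt = 'T1' then challenge = 'NO';
    else challenge = 'YES';

    /* Supplement rate in g/ton; zero for T1-T3 */
    select (trt);
        when ('T4') suppl = 0.125;
        when ('T5') suppl = 0.25;
        when ('T6') suppl = 0.5;
        when ('T7') suppl = 0.75;
        when ('T8') suppl = 1;
        otherwise   suppl = 0;
    end;
run;

/* Quick check that each treatment maps to one coding */
proc freq data=chicks;
    tables trt*ab*challenge*suppl / list;
run;
```

The PROC FREQ listing should show exactly one AB/CHALLENGE/SUPPL combination per treatment, which makes coding mistakes easy to spot before any modeling.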

 

The way I see this, it's a simple linear regression, where I'm trying to model the effect of the supplement overall and by day. I tried running a proc GLM as highlighted below:

ods graphics on;
PROC SORT DATA=CHICKS; BY VAR CHALLENGE SUPPL;
PROC GLM DATA=CHICKS PLOTS=DIAGNOSTICS; BY VAR;
CLASS DAY AB CHALLENGE SUPPL;
MODEL VALUE = DAY SUPPL DAY*SUPPL AB CHALLENGE;
LSMEANS DAY SUPPL DAY*SUPPL/PDIFF=all;
RUN;
ods graphics off;
QUIT;

 

However, I don't think this is the best approach, for several reasons: 

  1. I don't know how to model the RANDOM effect of the block (or if I should at all)
  2. I'm not modeling the REPEATED measures (the same birds were measured over time from the same PEN). 
  3. The time points are unequally spaced, and I think PROC GLM doesn't like that.

Is there a better way to do this? Thank you in advance for all your feedback. 

 

 

 

1 ACCEPTED SOLUTION

Mike_N
SAS Employee

You have a lot going on here, but I'll try to point you in the right direction. To answer your first two questions: you have repeated measurements on the same units (i.e., pens) over time, and you need to account for the (likely) case that those repeated measurements are correlated. If you were to use the model you have presented (ordinary linear regression), you would be implicitly assuming that all of your observations are independent. Because of your study design, you cannot make that assumption. There are lots of ways to specify a model that accounts for correlation among measurements, and you should start by reading the documentation for PROC MIXED and PROC GLIMMIX. To answer your third question: it should not be a problem that the measurements are unequally spaced. 
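
As one possible starting point (a sketch under stated assumptions, not necessarily the model this answer has in mind), PROC MIXED can cover all three concerns in one step: a RANDOM statement for block, a REPEATED statement for the within-pen correlation, and a covariance structure that uses the actual day values so unequal spacing is handled directly. CHICKS2 and DAYNUM are hypothetical names, DAY is assumed to be stored as a number, and the choices of TYPE=SP(POW) and DDFM=KR are illustrative.

```sas
/* Numeric copy of DAY for the spatial power structure;
   DAY itself is listed in CLASS. Assumes DAY is numeric. */
data chicks2;
    set chicks;
    daynum = day;
run;

proc sort data=chicks2;
    by var pen day;
run;

proc mixed data=chicks2;
    by var;                                   /* BW and FI analyzed separately */
    class block pen trt day;
    model value = trt day trt*day / ddfm=kr;
    random block;                             /* random block effect */
    /* Correlation between two days in the same pen decays with the
       number of days between them */
    repeated day / subject=pen type=sp(pow)(daynum);
run;
```

TYPE=UN is another structure that makes no assumption about spacing, at the cost of more covariance parameters. Comparing fit statistics such as AICC across candidate structures is a common way to choose between them.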


sb16031
Fluorite | Level 6

Thank you very much for your help, and apologies for the delay in replying. I took some time to review the documentation for MIXED and GLIMMIX. I ended up analyzing the data two ways. 

The first one with effect of AB, CHALLENGE and the SUPPL dose: 

 

PROC SORT DATA=CHICKS; BY VAR;
PROC GLIMMIX DATA=CHICKS; BY VAR;
CLASS DAY PEN AB CHALLENGE SUPPL;
MODEL VALUE = DAY|SUPPL AB CHALLENGE;
RANDOM DAY / SUBJECT=PEN TYPE = unr residual;
RUN;

 

The second one with just the effect of TRT and its interaction with DAY:

PROC SORT DATA=CHICKS; BY VAR;
PROC GLIMMIX DATA=CHICKS; BY VAR;
CLASS DAY PEN TRT;
MODEL VALUE = DAY|TRT;
RANDOM DAY / SUBJECT=PEN TYPE = unr residual;
SLICE DAY*TRT / sliceby=DAY lines ADJUST=tukey;
RUN;

 

Do you see anything glaringly wrong with this? 

Thank you again. 

 

 

Mike_N
SAS Employee

This looks like you are on the right track. On quick review, I have two thoughts. First, consider whether you want to include a random intercept in the model. It's not necessarily required, just think about whether you expect random variation in the outcome variable at baseline, and whether you want to capture that variation in your model. Second, I can't tell exactly how your data is structured, but you might need to modify the subject= argument in the random statement. If it's the case that each pen has its own unique identifier, then what you have looks fine. But if the pen numbers are nested within block, i.e., there is a pen numbered 1 in block 1, block 2, etc., then consider subject = pen(block). 
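
A minimal sketch of the second point, assuming the CHICKS dataset and variable names used earlier in the thread. The PROC SQL step lists any pen number that appears in more than one block; if it returns rows, the nested form of SUBJECT= in the GLIMMIX step is the one to use.

```sas
/* Does any pen number occur in more than one block? */
proc sql;
    select pen, count(distinct block) as n_blocks
    from chicks
    group by pen
    having count(distinct block) > 1;
quit;

proc sort data=chicks;
    by var;
run;

proc glimmix data=chicks;
    by var;
    class block pen trt day;       /* BLOCK must be in CLASS for PEN(BLOCK) */
    model value = day|trt;
    /* PEN(BLOCK) identifies a subject by block and pen number together */
    random day / subject=pen(block) type=unr residual;
run;
```

When pen numbers are already unique across blocks, SUBJECT=PEN and SUBJECT=PEN(BLOCK) define the same subjects, so the nested form is harmless either way.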

sb16031
Fluorite | Level 6

Thank you for your very prompt feedback! To answer your questions:

 

1) At the beginning of the trial, the birds are weighed and allocated to the treatments in a way that ensures no significant differences in BW at day 1. So at least the BW variable should be okay. There could be an inherent feed intake effect at baseline, but they're also genetically homogenous so barring any strange circumstance, they should be identical on that front too. 

2) The pens are indeed unique, i.e. Block 1 has pens 1-8, Block 2 has pens 9-16, etc. I think the block variable is there for when studies have non-unique identifiers for pens, but that's not the case in this particular study. 

 

If I were to include a random intercept, would that be: 

 

RANDOM int DAY/ SUBJECT=PEN TYPE = unr residual;

 

or just 

 

RANDOM int / SUBJECT=PEN TYPE = unr residual;

 

?

Mike_N
SAS Employee

You can specify the random effects either way; a single RANDOM statement that contains both int and day, or with a separate RANDOM statement for each effect. 
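
A sketch of the separate-statement form, with several assumptions that are not from the thread. In PROC GLIMMIX the RESIDUAL option applies to every effect on the statement it appears in, so giving the intercept its own RANDOM statement keeps it a G-side pen effect while DAY stays R-side. The sketch also swaps TYPE=UNR for TYPE=SP(POW), because an unstructured R-side matrix already contains every within-pen variance and covariance, which generally leaves a separate pen intercept variance not estimable. CHICKS2 and DAYNUM are hypothetical names, and DAY is assumed to be numeric.

```sas
/* Numeric copy of DAY for the spatial power structure */
data chicks2;
    set chicks;
    daynum = day;
run;

proc sort data=chicks2;
    by var;
run;

proc glimmix data=chicks2;
    by var;
    class pen trt day;
    model value = day|trt / ddfm=kr;
    random intercept / subject=pen;                          /* G-side pen intercept */
    random day / subject=pen type=sp(pow)(daynum) residual;  /* R-side, within pen */
run;
```

If the G matrix is reported as not positive definite or the intercept variance is estimated at zero, that is usually a sign the intercept is not adding anything beyond the R-side structure.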
