Solved: Analysing data when covariates are limited to 2/3 treatment groups

LL25 · Posted 07-25-2023 04:09 AM

Hi!

I am running statistical procedures (proc mixed/glimmix) on repeated measurements (Time = 1 to 5) of blood metabolites in calves. I have three treatment groups (Treatment = A, B, C). Repeated measurements for "C" are only available for Time 3 to Time 5 (and unavailable for Time 1 and 2).
I am assessing differences between treatment groups at every time point, and also changes over time within each treatment group. Models that include the treatment*time interaction do not converge correctly because of the missing values for Treatment C at Time 1 and Time 2. Therefore I have so far had to assess differences between treatment groups at each time point separately.

My current model is:
proc mixed data=BLOOD plots=none;
by Time;
class calf treatment breed sex origin;
model Glucose = treatment age breed sex origin / ddfm=kr residual outp=predresid;
lsmeans treatment / adjust=Tukey;
run;

I am dealing with a set of covariates: age (continuous), breed (1, 2), origin (1, 2), and sex (M, F). The problem is that breed, origin, and sex vary only within Treatments A and B, whereas all calves in Treatment C share the same breed, origin, and sex.
Due to the limited repeated measures in treatment C, and non-applicable covariates, it has been suggested to me that I need to split up this analysis. Is it possible to account for covariates in treatments A and B by running an initial model (y = breed + origin + sex), and using estimated values from the ods output in a second model where all treatment groups are included?

Example model 1:
proc mixed data=BLOOD plots=none;
by time;
where treatment < 3;
class calf origin breed sex;
model Glucose = origin breed sex / ddfm=kr residual outp=predresid;
random calf;
run;

To then run the following model including all treatment groups:
proc mixed data=BLOOD plots=none;
by time;
class calf treatment;
model Glucose = treatment age / ddfm=kr residual outp=predresid;
lsmeans treatment / adjust=Tukey;
run;

I’m not sure if this method accounts for the covariance correctly, as the main treatment effect and age are not included in the initial model. But I also don’t know if the very first model will run correctly if three of the covariates are restricted to two of the three treatment groups. Would anyone have any advice on how to account for these unbalanced covariates?

SteveDenham · Posted 07-25-2023 08:55 AM

Compositing the variables could help. Main effects and interactions could then be tested/estimated using LSMESTIMATE statements. In particular, I think you probably need to accommodate any correlation of the residuals over time. Consider the following:

proc mixed data=BLOOD plots=none;
class calf time treatment breed sex origin;
/* e prints the L-matrix coefficients needed to write the LSMESTIMATE rows */
model Glucose = age treatment*time*breed*sex*origin / e;
/* compound symmetry for the residuals within each calf over time */
repeated time / subject=calf type=cs;
store out=stored_1;
run;

proc plm restore=stored_1;
lsmestimate <this will depend on the coefficients from the /e option in the model statement in proc mixed above>
<also this is where any adjustments for multiplicity such as adjust=bon or adjust=simulate would be incorporated>;
run;

But for now, just getting PROC MIXED to run with a repeated structure should be considered a win.

For more background on this approach, look in Milliken and Johnson's Analysis of Messy Data, vol. 1 for analyses that use a "means model".
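
The compositing idea can also be written out explicitly by building one class variable per treatment-by-covariate combination. This is only a minimal sketch under assumptions: the dataset name BLOOD_cells and the variable cell are hypothetical, and the noint and ddfm=kr options are illustrative choices that are not part of the code above. Combinations that have no data (for example Treatment C at Time 1 and 2) cannot be estimated and will show up as such in the LSMEANS table.

```sas
/* Hypothetical helper step: collapse treatment and the class covariates
   into a single composite "cell" variable (names are assumptions). */
data BLOOD_cells;
   set BLOOD;
   length cell $ 40;
   /* CATX joins the values with "_" and accepts numeric or character input */
   cell = catx('_', treatment, breed, sex, origin);
run;

/* Means-model style fit: one mean per cell at each time, age as covariate,
   compound-symmetric residual correlation within calf. */
proc mixed data=BLOOD_cells plots=none;
   class calf time cell;
   model Glucose = age cell*time / noint ddfm=kr;
   repeated time / subject=calf type=cs;
   lsmeans cell*time;
run;
```

Treatment and time comparisons would then still be built from these cell means, for example with LSMESTIMATE rows averaged over the covariate levels present in each treatment.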

SteveDenham

Ksharp · Posted 07-25-2023 07:41 AM

As written, your code fits a GLM, not a MIXED model, and I would not advocate it.
I think you would do better to post your question in the Statistical Procedures forum:
https://communities.sas.com/t5/Statistical-Procedures/bd-p/statistical_procedures

@StatDave, @lvm, or @SteveDenham could give you good advice there.

LL25 · Posted 07-26-2023 06:28 AM

Thank you for your help! I will work on getting the repeated structure working and include all covariates in the model then.

SteveDenham · Posted 07-26-2023 07:56 AM

A PROC SUMMARY or MEANS involving all covariates would be a good check for empty cells due to cross tabulating, so maybe run that before trying to get the mixed model running.
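
A minimal sketch of such a check, assuming the dataset and variable names used earlier in this thread (BLOOD, Glucose, treatment, time, breed, sex, origin). The choice of PROC FREQ with the SPARSE option is an addition here so that empty combinations are listed with a zero count instead of being left out:

```sas
/* List every treatment x time x breed x sex x origin combination.
   SPARSE makes the LIST output include combinations with zero records. */
proc freq data=BLOOD;
   tables treatment*time*breed*sex*origin / list sparse nopercent nocum;
run;

/* Same cross-classification with PROC MEANS: N and NMISS show how many
   Glucose values are actually present in each observed cell. */
proc means data=BLOOD n nmiss mean nway;
   class treatment time breed sex origin;
   var Glucose;
run;
```

Any combination with a zero count in the first table is an empty cell that the mixed model will not be able to estimate.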

SteveDenham
