tadgerviloria
Calcite | Level 5

Dear SAS community, 

 

I would like to understand how to use PROC MIXED to fit a repeated measures ANOVA with unbalanced data in a reliable way. I found several posts:

  1.  Solved: PROC MIXED vs. ANOVA - SAS Support Communities
    1. No code is provided, but the thread discusses the pros and cons of both procedures.
  2. Microsoft Word - A beginner's example of PROC MIXED- Sarah R Greene.doc (lexjansen.com)
    1. Transposing data (wide format) and use of the RANDOM statement
    2. CODE:
        PROC MIXED DATA=mydata.alldata_analysis1;
          CLASS word_type word_length subject;
          MODEL rt = word_type word_length word_type*word_length / ddfm=bw;
          RANDOM intercept / sub=subject type=un;
          LSMEANS word_type*word_length;
        RUN;

When I fit an ANOVA model in PROC MIXED, I expected it to behave like other ANOVA procedures (such as GLM) with respect to unbalanced observations, that is, to exclude them from the model. However, PROC MIXED uses the unbalanced observations.

Thanks in advance

Kind regards

Philippe 

6 REPLIES
PaigeMiller
Diamond | Level 26

Do NOT use PROC ANOVA for unbalanced data. It is my understanding that both PROC MIXED and PROC GLM handle unbalanced data properly, and the complete unbalanced data is used in the analysis. I don't know what it means to say the unbalanced observations are not used in the model -- the concept doesn't even make sense to me.

--
Paige Miller
tadgerviloria
Calcite | Level 5

Dear Paige 

Many thanks for your prompt response.

My understanding is that balanced/unbalanced data is an equivalent term for complete/incomplete cases.

 

"A repeated measures ANOVA requires a balanced number of repeated measurements for each experimental unit. Due to this requirement, experimental units with missing measurements are completely excluded from the analysis" (Guidelines for repeated measures statistical analysis approaches with basic science research conside...

 

When I said that "the unbalanced observations [are] not being included in the model", I meant the same thing as the guideline paper's statement that "[the] missing measurements are completely excluded from the analysis". This is the behaviour I would expect from any ANOVA procedure. I wonder whether it can be reproduced in PROC MIXED, since the way I am fitting the model keeps the unbalanced observations (incomplete cases) in the model. Please see the code I used below:

 

proc mixed data = DATA_ANOVA;
  class ID  TRT VISIT ;
  model chg = TRT TRT*VISIT / solution cl;
  repeated / subject=ID type =   AR(1);
run;

 

Thanks in advance

Philippe

 

 

 

StatsMan
SAS Super FREQ

Balanced/Unbalanced data refers to the counts in each cell of your design (same number of observations per treatment group). PROC ANOVA requires balanced data in the design, PROC GLM and PROC MIXED do not.
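One quick way to check whether a design is balanced in this sense is to tabulate the cell counts. This is only a minimal sketch, assuming the data set and variable names from the code posted earlier in this thread (DATA_ANOVA, TRT, VISIT, chg):

```sas
/* Count the non-missing responses in each TRT-by-VISIT cell. */
/* Equal counts in every cell indicate a balanced design.     */
proc freq data=DATA_ANOVA;
  where not missing(chg);
  tables TRT*VISIT / norow nocol nopercent;
run;
```

Unequal counts across visits within a treatment group usually mean that some subjects have incomplete repeated measures.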

The data situation you describe is slightly different. Subjects with incomplete repeated measures are not included in PROC GLM. The method of moments used in GLM requires complete data for each subject. Subjects with incomplete data are used in PROC MIXED. Maximum likelihood methods do not require that subjects have observations for all time points. Note, however, that MIXED allows only one observation per time point for each subject. So, GLM and MIXED will not agree if you have incomplete data on your subjects in a repeated measures analysis.
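If the goal is to make PROC MIXED use only the subjects that GLM would keep, one option is to drop the incomplete subjects before fitting. The sketch below assumes the long-format data set from earlier in the thread (DATA_ANOVA with ID, TRT, VISIT, chg). The names complete_cases and n_visits are made up for illustration, and it assumes a "complete" subject is one with a non-missing chg at every visit that appears in the data:

```sas
proc sql noprint;
  /* Number of distinct visits with at least one observed response */
  select count(distinct VISIT) into :n_visits trimmed
  from DATA_ANOVA
  where chg is not missing;

  /* Keep only subjects with a non-missing response at every visit */
  create table complete_cases as
  select a.*
  from DATA_ANOVA as a
  where a.chg is not missing
    and a.ID in (select ID
                 from DATA_ANOVA
                 where chg is not missing
                 group by ID
                 having count(distinct VISIT) = &n_visits);
quit;

/* Same model as posted above, fitted to complete cases only.      */
/* VISIT is named in REPEATED so rows are matched to time points   */
/* explicitly rather than by their order within each subject.      */
proc mixed data=complete_cases;
  class ID TRT VISIT;
  model chg = TRT TRT*VISIT / solution cl;
  repeated VISIT / subject=ID type=AR(1);
run;
```

Even on the same complete cases, the results need not match a GLM repeated measures analysis exactly, because the AR(1) covariance structure and the estimation method are different.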

tadgerviloria
Calcite | Level 5

Dear StatsMan

 

Many thanks for your response!
Indeed, that is my experience: PROC MIXED keeps the incomplete cases in a repeated measures ANOVA, but PROC GLM does not. It is nice to know the reason behind the discrepancy (MLE in PROC MIXED, and MoM in PROC GLM).

 

When you said "PROC ANOVA requires balanced data in the design, PROC GLM and PROC MIXED do not.", were you referring to a pre-post design (with no repeated measures in between)? Why do PROC GLM and PROC ANOVA differ in this context?

 

Do you know what PROC ANOVA is expected to do in a repeated measures ANOVA when there are incomplete cases?

 

Thanks in advance

Philippe

 

StatsMan
SAS Super FREQ

It is best just to avoid PROC ANOVA. It was meant as a procedure for textbook-type problems. The method behind PROC ANOVA requires balanced data and there is no way to work around that requirement. Use GLM for your modeling needs that use only fixed effects. For models with random effects and/or repeated measures, use PROC MIXED. 

ballardw
Super User

From the on-line help for Proc Anova in the Overview section at the start:

Overview: ANOVA Procedure

 

The ANOVA procedure performs analysis of variance (ANOVA) for balanced data from a wide variety of experimental designs. In analysis of variance, a continuous response variable, known as a dependent variable, is measured under experimental conditions identified by classification variables, known as independent variables.   The variation in the response is assumed to be due to effects in the classification, with random error accounting for the remaining variation.

Emphasis added.

Traditional ANOVA, in my personal opinion, had exactly one advantage: the calculations could be done by hand. (And I've done them that way, because the computers I had available in the 1970s didn't have appropriate software.)

 

 
