Hello. I am seeking help with specifying the right model to analyze data from a cluster-randomized controlled trial. I apologize for the long message, but I am trying to be as clear as possible.

I expect an overall improvement over time (T1 vs T2), but a stronger improvement in the treatment condition than in the control condition. In other words, I expect an interaction between condition and time. The dependent variable is measured in three different ways, that is, using three different measures, all on the same scale. I expect the same interaction effect for all three measures. In addition, I have two individual-difference variables, personality1 and personality2 (P1 and P2 below), which I want to include as full factors.

The complication is that the random assignment to conditions was not done at the subject level, but at the site level. The variable SITE has 23 sites with a roughly equal, although not identical, number of subjects in each (circa 30 per site). 13 sites were assigned to the treatment condition and 10 to the control condition. Finally, the sites were in different areas, so I have an additional factor called AREA, which I intend to treat as random.

To recap:
Cond: Treat vs Ctrl
Time: T1 vs T2 (repeated)
Measure: a, b, c, d (repeated)
subj = unique identifier for each participant
P1 and P2 = continuous predictors
DV = continuous variable

This is the basic model that I am thinking of using:

proc mixed data=dataset1;
  class subj cond time measure site area;
  model DV = cond|time|measure|P1|P2;
  random intercept / subject=area;
  random intercept / subject=site(area); /* since sites are nested within area */
  repeated / subject=subj type=cs;
run;

My questions:
1. Are the two RANDOM statements enough to account for the fact that the random assignment was done at the site level rather than at the subject level?
2. Is the REPEATED statement correct, or do I have to add the two repeated variables to it?

If I have to add the repeated factors to the REPEATED statement, I gather from the documentation and from several articles that used somewhat similar designs that I have a few options. If I want to use compound symmetry as the covariance structure, I cannot simply specify two repeated factors; it does not work. But I seem to have two options.

Option 1. Put one repeated factor in the GROUP= option of the REPEATED statement. In my case I would use this:

proc mixed data=dataset1;
  class subj cond time measure site area;
  model DV = cond|time|measure|P1|P2;
  random intercept / subject=area;
  random intercept / subject=site(area); /* since sites are nested within area */
  repeated time / subject=subj*measure group=measure type=cs r rcorr;
run;

Using the above syntax, however, I get different results depending on which of the two factors (time or measure) I specify in GROUP=. I do not understand why.

Option 2. Create a new class variable in my data set that combines time and measure:

data new_dataset1;
  set dataset1;
  time_measure = cats(time, '_', measure); /* combine time and measure into a single factor */
run;

proc mixed data=new_dataset1;
  class subj cond time measure site area time_measure;
  model DV = cond|time|measure|P1|P2;
  random intercept / subject=area;
  random intercept / subject=site(area); /* since sites are nested within area */
  repeated time_measure / subject=subj type=cs;
run;

Option 3.
A third option entails changing the covariance structure:

proc mixed data=dataset1;
  class subj cond time measure site area;
  model DV = cond|time|measure|P1|P2;
  random intercept / subject=area;
  random intercept / subject=site(area); /* since sites are nested within area */
  repeated time measure / subject=subj type=un@un;
run;

Option 2 (repeated time_measure / subject=subj type=cs;) and the original model (repeated / subject=subj type=cs;) give exactly the same results. Option 1 (repeated time / subject=subj*measure group=measure type=cs r rcorr;) and Option 3 (repeated time measure / subject=subj type=un@un;) give results that differ from each other and from both Option 2 and the original model.

For reasons that have to do with previous research, I would prefer to use compound symmetry as the covariance structure, which means the original model, if it is valid, or Option 1 or Option 2. Are they equally valid?
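
As a rough sketch only (an assumption based on how TYPE=CS, GROUP=, and TYPE=UN@UN are generally documented for PROC MIXED, not something checked against this data), the within-subject residual covariance implied by each REPEATED specification could be written as follows. The notation is hypothetical shorthand introduced here: T = 2 time points, M = number of measures, I_n is the n x n identity matrix, and J_n is the n x n matrix of ones.

```latex
% Original model and Option 2: one compound-symmetric block per subject i,
% covering all T*M observations of that subject.
\[
  R_i = \sigma^2 I_{TM} + \sigma_{cs} J_{TM}
\]

% Option 1 (subject=subj*measure, group=measure): one block per
% subject-by-measure combination, with separate parameters for each measure m.
% Residuals from different measures of the same subject are uncorrelated in R.
\[
  R_{im} = \sigma_m^2 I_T + \sigma_{cs,m} J_T ,
  \qquad
  \mathrm{Cov}(e_{imt}, e_{im't'}) = 0 \quad \text{for } m \neq m'
\]

% Option 3 (type=un@un): Kronecker product of an unstructured T x T matrix
% for time and an unstructured M x M matrix for measure.
\[
  R_i = \Sigma_{\mathrm{time}} \otimes \Sigma_{\mathrm{measure}}
\]
```

If this reading is right, it would explain why Option 2 reproduces the original model: a compound-symmetric block does not depend on the order or labelling of the observations within a subject. Option 1 and Option 3 would then be genuinely different covariance models, and changing the GROUP= variable in Option 1 would change which set of parameters is allowed to differ.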