BookmarkSubscribeRSS Feed
ninemileshigh
Obsidian | Level 7

My use of a mixed effects model has been queried for a repeated measures analysis. I am working in SAS 9.4. In my experimental set up I have a randomised block design, with 5 blocks each containing 1 replicate of 3 levels of a treatment (5 * 3 , n= 15)- variable name "TREAT". Measurements of my reponse variable (LOG_N2O) were taken on 75 occasions (CYCLETIME), hence the need for a repeated measures analysis, but since there are missing data from nearly every measurement time, ANOVA is not suitable. The code I have used is below

 

PROC MIXED DATA= AUTOSTATS;
CLASS BLOCK TREAT CYCLETIME;
MODEL LOG_N2O= TREAT CYCLETIME BLOCK
TREAT*CYCLETIME
TREAT*BLOCK
CYCLETIME*BLOCK;
REPEATED CYCLETIME / SUBJECT= CHANNEL TYPE= AR(1);
RANDOM CHANNEL BLOCK;
LSMEANS TREAT CYCLETIME BLOCK /* POST HOC TEST FOR IDENTIFYING THE DIFFERENCES */
TREAT*CYCLETIME
TREAT*BLOCK
CYCLETIME*BLOCK
/ ADJUST= TUKEY ;
ODS EXCLUDE CONVERGENCESTATUS; /*SUPPRESS THE PRINTING AS IT TAKES UP MEMORY*/
ODS EXCLUDE COVPARMS;
ODS EXCLUDE DIMENSIONS;
ODS EXCLUDE FITSTATISTICS;
ODS EXCLUDE ITERHISTORY;
ODS EXCLUDE LRT;
ODS EXCLUDE MODELINFO;
ODS EXCLUDE NOOBS;
ODS OUTPUT DIFFS= MEANTREAT TESTS3= N2O_FSTATS;
RUN;

 

 

 

This gives me output that is understandable and makes sense to me...

 

SAS Output

Class Level Information
Class Levels Values
BLOCK 5 1 2 3 4 5
TREAT 3 FER NH4 NO3
CYCLETIME 75 01APR2014:03:41:50 01APR2014:09:34:55 01APR2014:23:12:41 02APR2014:12:34:39 02APR2014:16:42:48 02APR2014:20:51:37 03APR2014:00:59:42 03APR2014:05:08:41 03APR2014:09:17:52 03APR2014:13:26:31 03APR2014:17:36:16 03APR2014:21:44:46 04APR2014:01:52:23 04APR2014:05:59:54 04APR2014:10:09:08 04APR2014:16:07:16 04APR2014:20:14:46 05APR2014:00:23:31 05APR2014:04:33:54 05APR2014:08:42:11 05APR2014:12:49:13 05APR2014:16:55:41 05APR2014:21:02:37 06APR2014:01:10:53 06APR2014:05:20:49 06APR2014:12:56:13 07APR2014:15:17:47 09APR2014:12:27:32 09APR2014:16:36:09 09APR2014:20:47:03 10APR2014:00:59:10 10APR2014:05:10:27 10APR2014:09:19:44 10APR2014:20:50:00 11APR2014:01:02:00 11APR2014:18:46:00 11APR2014:22:57:00 12APR2014:03:08:00 12APR2014:07:17:00 12APR2014:11:26:00 12APR2014:15:34:00 13APR2014:16:09:00 13APR2014:20:19:00 14APR2014:00:28:00 24MAR2014:17:41:36 24MAR2014:22:03:01 25MAR2014:02:38:52 25MAR2014:07:14:45 25MAR2014:15:52:05 25MAR2014:20:13:41 26MAR2014:00:49:28 26MAR2014:05:25:17 26MAR2014:10:01:16 26MAR2014:16:41:00 26MAR2014:21:16:41 27MAR2014:01:52:21 27MAR2014:06:27:58 27MAR2014:15:40:36 27MAR2014:20:02:04 28MAR2014:00:37:53 28MAR2014:05:13:38 28MAR2014:09:49:36 28MAR2014:14:25:30 29MAR2014:19:23:01 29MAR2014:23:57:56 30MAR2014:04:32:23 30MAR2014:09:07:49 30MAR2014:13:43:31 30MAR2014:18:18:59 30MAR2014:22:53:40 31MAR2014:03:28:38 31MAR2014:08:04:00 31MAR2014:13:32:48 31MAR2014:18:26:10 31MAR2014:23:04:18

Number of Observations
Number of Observations Read 1110
Number of Observations Used 794
Number of Observations Not Used 316

Type 3 Tests of Fixed Effects
Effect Num DF Den DF F Value Pr > F
TREAT 2 356 9.76 <.0001
CYCLETIME 61 356 27.18 <.0001
BLOCK 4 0 29.90 .
TREAT*CYCLETIME 122 356 1.35 0.0176
BLOCK*TREAT 8 356 18.85 <.0001
BLOCK*CYCLETIME 240 356 1.03 0.3987

 

However, the degrees of freedom provided in the SAS output have been questioned and I need to be certain of my analysis as it is for publication. To solve the issue, I then went to SPSS to run the analysis there.... and opened another can of worms, since the output of the model is so different, in that the F value for the effect of variable "TREAT" is much much lower in SPSS, the degrees of freedom are also much much lower and hence the p value now non-sig. It's a long time since I used SPSS so I am less inclined to trust my coding in that platform. My question is whether the model I have coded above is a valid way to analyse my data, and whether the degrees of freedom are correct (why they are what they are, so that I can justify my analysis)

7 REPLIES 7
Miracle
Barite | Level 11
It could be due to the difference in method for computing the denominator degrees of freedom between both SAS and SPSS. As far as I know, the default is CONTAINment for SAS and Satterthwaite for SPSS. HTH.
sld
Rhodochrosite | Level 12 sld
Rhodochrosite | Level 12

What is CHANNEL, and what role does it play in your study? The term is in your model but not in the description of your study (which otherwise is quite helpful).

 

What is it that you want to know about the effect of CYCLETIME? 75 levels is a lot to sort out in an ANOVA model. (Your current code specifies a mixed-model ANOVA.)

 

ninemileshigh
Obsidian | Level 7

Hi, 

 

CHANNEL is the unique id of each replicate. It is a numeric label. 

 

My interest is primarily whether there is a treatment effect (TREAT), but also whether that changes with time (CYCLETIME) and if gthere is an interaction between the two. 

sld
Rhodochrosite | Level 12 sld
Rhodochrosite | Level 12

So you have 15 "replicates", 3 in each of 5 blocks?

 

If you had balanced data, then the interaction of TREAT and CYCLETIME would have 3 x 75 = 225 means. How would you go about interpreting interaction in this scenario? Would it be more parsimonious to regress the response on CYCLETIME and assess whether the 3 regressions are parallel across CYCLETIME?

sld
Rhodochrosite | Level 12 sld
Rhodochrosite | Level 12

Good for your critic. You should definitely question the correctness of the model when it reports zero denominator degrees of freedom, no matter how much you like the results. Generally zero den df means that your model is over-specified.

 

And, in fact, your model is overspecified: you have BLOCK in both MODEL and RANDOM statements, which is not correct--you are asking the model to estimate the same parameters twice which makes the model unhappy. Theoretically, if BLOCK is a random effects factor, BLOCK and all interactions involving BLOCK are random effects factors. Consequently those terms generally should not be in the MODEL statement.

 

If you have at least one observation in each combination of TREAT and CYCLETIME, then you could consider this model:

 

proc glimmix data=autostats;
  class block treat cycletime;
  model log_N2O = treat|cycletime;
  random block;
  *random block*treat; /* might be included for type=ar(1) */
  repeated cycletime / subject=block*treat type=ar(1);
lsmeans treat|cycletime; run;

Note that "channel" in your model is (I think) equivalent to BLOCK*TREAT. You don't need to use CHANNEL--BLOCK*TREAT works just fine--but if you do, it should be in the CLASS statement.

 

If the TREAT x CYCLETIME factorial is incomplete and if TREAT*CYCLETIME is included in the treatment structure of the model, then various lsmeans will be reported as "non-est" (not estimable) and the interpretation of the interaction may be (probably is) suspect.

 

A fixed effects factor with 75 levels (like CYCLETIME) is very unwieldy in an ANOVA model. If it is significant, how do you sort out which means are different from which other means? The number of pairwise comparisons is 2,775 (the combination of 75 things taken 2 at a time). 2,775!  Consider the impact on power of controlling for family-wise Type I error among 2,775 comparisons using the Tukey method. Plus I can't imagine that you actually care about each of the 75 means. Plus the interaction is even more unwieldy.

 

So I suggest giving serious consideration to dealing with CYCLETIME in some other way that makes sense in the context of your data. You could, for example, compute weekly averages (if that was defensible in context) and continue with a mixed-model ANOVA approach. Another approach is a regression of the response on CYCLETIME, maybe using spline regression rather than linear or curvilinear regression if the linearity assumption is not met. A regression approach would also accommodate your missing data problem. These sorts of models are very complex and definitely should not be pursued naively. (I've added italics for emphasis. You really need to know what you are doing with these models.)

 

To use CYCLETIME as the predictor variable in regression, you would need to convert it from a text variable to a numeric variable.

 

A few resources related to a regression approach:

 

Chapters 7 and 8 in https://www.sas.com/store/books/categories/usage-and-reference/sas-for-mixed-models-second-edition/p...

 

https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_glimmix_sec...

 

http://blogs.sas.com/content/iml/2017/04/19/restricted-cubic-splines-sas.html

 

Good luck!

 

 

jispar
Calcite | Level 5

Hello,

 

Can you use Repeated in Proc Glimmix?   I tried your code, but I got error message saying "ERROR 180-322: Statement is not valid or it is used out of proper order."  I thought we use "random _residual_" for the repeated measures in Proc Glimmix?  Is it just my SAS program?   I have v9.4.

 

Thanks!

JP

sld
Rhodochrosite | Level 12 sld
Rhodochrosite | Level 12

My mistake, my apologies.

 

You could replace "glimmix" with "mixed".

 

Or you could replace the repeated statement with

random cycletime / subject=block*treat type=ar(1) residual;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 7 replies
  • 3106 views
  • 3 likes
  • 4 in conversation