bmarc
Fluorite | Level 6

Good afternoon,

I am trying to run a multivariate analysis using proc mixed as part of my dissertation work. The purpose of this analysis is to determine if temperature (temp) affects several metabolic parameters (var) for fish. Each individual is identified in the dataset (number) and a condition factor (k) is also included.

I have managed to code a working model but when I look at the LS Means output, things do not make sense. I have 5 responses that are on different scales and while the estimates for the LSMs look reasonable, the standard errors are very similar across all responses and the DFs are incredibly high (168 when I sampled 48 individuals).

I have tried finding help online but have not been successful. I'm not sure if the issue with the analysis is something in the code or in the way the data is set up. Any thoughts or suggestions are appreciated. I've pasted the model code here and attached some of the data as well.

Thanks,

Ben

 

proc mixed data=mva.data3(where=(species='spot')) method=reml order=data;
  class var number temp;
  model y = var var*temp var*k
        / noint solution outp=mva.spot_out;
  random number;
  lsmeans var*temp / CL; *order is var then temp;
  run;

ACCEPTED SOLUTION
SteveDenham
Jade | Level 19

The key here is that the variables coded in 'var' are repeated measurements on each individual.  You'll want to accommodate the correlation between them, if possible.  Consider the following:

 

proc mixed data=mva.data3(where=(species='spot')) method=reml order=data;
  class var number temp;
  model y = var var*temp var*k
        / noint solution outp=mva.spot_out ddfm=kr;
  repeated var / type=un subject=number; *unstructured covariance among the responses within each fish;
  *random number; *replaced by the repeated statement above;
  lsmeans var*temp / CL; *order is var then temp;
run;

I used the unstructured covariance type here, as I have no prior knowledge about possible correlations.  For your work, it may be more interpretable to use the UNR structure, which reports the correlations between the variables rather than the covariances.  I also changed the denominator degrees of freedom to the Kenward-Roger method (ddfm=kr), as it has better properties than the default between-within method, and it should give sensible degrees of freedom for the lsmeans.
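As a rough illustration of the UNR suggestion, the same model could be refit as below. This is only a sketch: the dataset and variable names come from the original post, while the R and RCORR options and the dropped OUTP= data set are additions made here for illustration, not part of the original code.

```sas
/* Sketch only: same model, with TYPE=UNR so the within-fish covariance
   is reported as variances plus correlations among the responses.
   R and RCORR are added here for illustration; OUTP= is omitted so the
   earlier output data set is not overwritten. */
proc mixed data=mva.data3(where=(species='spot')) method=reml order=data;
  class var number temp;
  model y = var var*temp var*k
        / noint solution ddfm=kr;
  /* R prints the estimated covariance block for the first subject,
     RCORR prints the corresponding correlation matrix */
  repeated var / type=unr subject=number r rcorr;
  lsmeans var*temp / cl;
run;
```

TYPE=UN and TYPE=UNR describe the same covariance matrix in two parameterizations, so the LS-means should be essentially unchanged. What changes is that the covariance parameter table shows correlations directly.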

 

Steve Denham

 

bmarc
Fluorite | Level 6

Steve,

Thanks for the response. That solves the problem with the standard errors and dfs. I am a little confused though. This doesn't seem to be a repeated measures design in the sense that I understand it (multiple measurements of the same response) so I'm not sure why I need to use the repeated statement rather than the random statement. Any insight or explanation you have would be greatly appreciated.

Thanks again,

Ben

SteveDenham
Jade | Level 19

Hi Ben,

 

Buried in the MIXED documentation is a section under TYPE= that outlines Kronecker products, with an example of height and weight measured over time.  It was where I first thought of going, but you had only a single measure.  Well, that corresponds to one block of the Kronecker product matrix.  

 

In your case, you have multiple measures on a single subject.  Whether they are spaced in time, location or some other dimension that describes the relationship among metabolic parameters makes little difference--they are still multiple measures on a subject.  
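One way to see why this matters for the standard errors is to write out the covariance each model implies for the stacked responses of one fish. The following is a sketch under the usual normal-theory assumptions, with five responses per fish as in the original post. The symbols are introduced here and do not appear in the thread.

```latex
\begin{align*}
% Stacked responses for fish i (five metabolic parameters)
\mathbf{y}_i &= (y_{i1}, \dots, y_{i5})^\top \\[4pt]
% RANDOM number: one shared intercept per fish plus independent residuals
\text{RANDOM number:}\quad
\operatorname{Var}(\mathbf{y}_i) &= \sigma_b^2\,\mathbf{J}_5 + \sigma_e^2\,\mathbf{I}_5,
\qquad \operatorname{Var}(y_{ij}) = \sigma_b^2 + \sigma_e^2 \ \text{for every } j \\[4pt]
% REPEATED var / TYPE=UN SUBJECT=number: a free 5 x 5 covariance matrix
\text{REPEATED var / TYPE=UN:}\quad
\operatorname{Var}(\mathbf{y}_i) &= \boldsymbol{\Sigma} =
\begin{pmatrix}
\sigma_{11} & \cdots & \sigma_{15} \\
\vdots & \ddots & \vdots \\
\sigma_{51} & \cdots & \sigma_{55}
\end{pmatrix},
\qquad \operatorname{Var}(y_{ij}) = \sigma_{jj}
\end{align*}
```

With a single random intercept, all five responses are forced to share one variance, which is consistent with the nearly identical standard errors in the original output even though the responses are on different scales. The unstructured form lets each response have its own variance and each pair its own covariance (15 parameters in total).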

 

An interesting thing would be to compare the results with the variables as correlated, but as G-side rather than R-side (my approach).  To do this, try:

 

proc mixed data=mva.data3(where=(species='spot')) method=reml order=data;
  class var number temp;
  model y = var var*temp var*k
        / noint solution outp=mva.spot_out ddfm=kr;
  random var / type=un subject=number;
  lsmeans var*temp / CL; *order is var then temp;
run;

The first thing that I might expect here is that the degrees of freedom might no longer be what you expect.  It also might not converge, as var is now both a fixed and a random effect (not that that is a bad thing in and of itself; it just means something like a fixed slope and random intercept, or vice versa).

 

But in the end, it comes down to multiple measurements on a single subject.  It's why you can use spatial covariance structures for data that are unequally spaced in time for traditional repeated measures analyses.
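To make the G-side versus R-side comparison concrete, one option is to capture the ODS tables from both fits and compare them. The sketch below assumes the dataset and variable names from the post. The output data set names (fit_r, lsm_r, fit_g, lsm_g) and the comparison tolerance are hypothetical choices made here.

```sas
/* R-side fit: save fit statistics and LS-means */
ods output FitStatistics=fit_r LSMeans=lsm_r;
proc mixed data=mva.data3(where=(species='spot')) method=reml order=data;
  class var number temp;
  model y = var var*temp var*k / noint solution ddfm=kr;
  repeated var / type=un subject=number;
  lsmeans var*temp / cl;
run;

/* G-side fit: identical fixed effects, covariance moved to RANDOM */
ods output FitStatistics=fit_g LSMeans=lsm_g;
proc mixed data=mva.data3(where=(species='spot')) method=reml order=data;
  class var number temp;
  model y = var var*temp var*k / noint solution ddfm=kr;
  random var / type=un subject=number;
  lsmeans var*temp / cl;
run;

/* Compare estimates, standard errors, and df row by row;
   the tolerance of 1e-6 is an arbitrary illustrative choice */
proc compare base=lsm_r compare=lsm_g method=absolute criterion=1e-6;
  var Estimate StdErr DF;
run;

/* Fit statistics side by side (-2 Res Log Likelihood, AIC, ...) */
proc print data=fit_r; run;
proc print data=fit_g; run;
```

If each fish contributes one value per response, the G-side model implies a within-fish covariance of G plus the residual variance times the identity matrix, so the residual variance is not separately identifiable from the diagonal of G. In that situation the two fits can produce matching LS-means, possibly with a note in the log about the G-side covariance parameters.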

 

Steve Denham

bmarc
Fluorite | Level 6

Thanks Steve,

I'll look into this, though from a preliminary run it looks like I get the same output from the model whether I use the repeated statement or the random statement.

Thanks for the help,

Ben
