multivariate analysis using proc mixed, issues wi...

06-16-2016 03:35 PM

Good afternoon,

I am trying to run a multivariate analysis using proc mixed as part of my dissertation work. The purpose of this analysis is to determine whether temperature (temp) affects several metabolic parameters (var) for fish. Each individual is identified in the dataset (number) and a condition factor (k) is also included.

I have managed to code a working model but when I look at the LS Means output, things do not make sense. I have 5 responses that are on different scales and while the estimates for the LSMs look reasonable, the standard errors are very similar across all responses and the DFs are incredibly high (168 when I sampled 48 individuals).

I have tried finding help online but have not been successful. I'm not sure if the issue with the analysis is something in the code or in the way the data is set up. Any thoughts or suggestions are appreciated. I've pasted the model code here and attached some of the data as well.

Thanks,

Ben

```
proc mixed data=mva.data3(where=(species='spot')) method=reml order=data;
class var number temp;
model y = var var*temp var*k
/ noint solution outp=mva.spot_out;
random number;
lsmeans var*temp / CL; *order is var then temp;
run;
```

Accepted Solutions


06-17-2016 08:58 AM

The key here is that the variables coded in 'var' are repeated measurements on each individual. You'll want to accommodate the correlation between them, if possible. Consider the following:

```
proc mixed data=mva.data3(where=(species='spot')) method=reml order=data;
class var number temp;
model y = var var*temp var*k
/ noint solution outp=mva.spot_out ddfm=kr;
REPEATED var/type=un subject=number;
*random number;
lsmeans var*temp / CL; *order is var then temp;
run;
```

I used the unstructured covariance type (TYPE=UN) here because I have no prior knowledge about possible correlations. For your work, the UNR structure may be more interpretable, since it reports the correlations between the variables rather than the covariances. I also changed the denominator degrees of freedom method to Kenward-Roger (DDFM=KR), which has better properties than the default between-within method and should give sensible degrees of freedom for the lsmeans.
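As a rough sketch of the UNR variant mentioned above (not run against the original data): the only substantive change is TYPE=UNR on the REPEATED statement. The R and RCORR options are added here as an assumption about what would be useful to inspect, and OUTP= is dropped so the earlier output dataset is not overwritten.

```sas
proc mixed data=mva.data3(where=(species='spot')) method=reml order=data;
  class var number temp;
  model y = var var*temp var*k / noint solution ddfm=kr;
  /* TYPE=UNR: same unstructured fit, but parameterized as variances
     plus correlations instead of covariances.
     R and RCORR print the estimated R matrix and its correlation form
     (by default for the first subject). */
  repeated var / type=unr subject=number r rcorr;
  lsmeans var*temp / cl;
run;
```

The fit itself should match the TYPE=UN model; only the way the covariance parameters are reported changes.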

Steve Denham

All Replies



06-17-2016 09:38 AM

Steve,

Thanks for the response. That solves the problem with the standard errors and dfs. I am a little confused though. This doesn't seem to be a repeated measures design in the sense that I understand it (multiple measurements of the same response) so I'm not sure why I need to use the repeated statement rather than the random statement. Any insight or explanation you have would be greatly appreciated.

Thanks again,

Ben


06-17-2016 10:02 AM

Hi Ben,

Buried in the MIXED documentation is a section under TYPE= that outlines Kronecker products, with an example of height and weight measured over time. It was where I first thought of going, but you had only a single measure. Well, that corresponds to one block of the Kronecker product matrix.

In your case, you have multiple measures on a single subject. Whether they are spaced in time, location or some other dimension that describes the relationship among metabolic parameters makes little difference--they are still multiple measures on a subject.
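To make the contrast concrete, here is a sketch of the covariance each specification implies for the five responses on fish i. It assumes one observation per response per fish; the symbols (b_i, sigma_b^2, sigma^2, Sigma) are notation introduced here for illustration, not taken from the SAS output.

```latex
% Original model (RANDOM number): one shared random intercept b_i per fish
% and a single residual variance for all five responses.
\begin{align*}
y_{ij} &= \mu_{j,\,\mathrm{temp}(i)} + \beta_j k_i + b_i + e_{ij},
  \qquad b_i \sim N(0, \sigma_b^2), \quad e_{ij} \sim N(0, \sigma^2), \\
\operatorname{Var}(\mathbf{y}_i) &= \sigma_b^2 \mathbf{J}_5 + \sigma^2 \mathbf{I}_5
  \qquad (\mathbf{J}_5 = \text{5 x 5 matrix of ones}).
\end{align*}

% REPEATED var / TYPE=UN SUBJECT=number: every response gets its own
% variance and every pair its own covariance (5 * 6 / 2 = 15 parameters).
\begin{align*}
\operatorname{Var}(\mathbf{y}_i) &= \boldsymbol{\Sigma}
  = \bigl[\sigma_{jj'}\bigr]_{j,j'=1}^{5}.
\end{align*}
```

Under the first form, every response is forced to share the variance sigma_b^2 + sigma^2 regardless of its scale, which is consistent with the near-identical standard errors reported in the original question. The unstructured form removes that constraint.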

An interesting thing would be to compare the results with the variables as correlated, but as G-side rather than R-side (my approach). To do this, try:

```
proc mixed data=mva.data3(where=(species='spot')) method=reml order=data;
class var number temp;
model y = var var*temp var*k
/ noint solution outp=mva.spot_out ddfm=kr;
RANDOM var/type=un subject=number;
lsmeans var*temp / CL; *order is var then temp;
run;
```

The first thing I would expect here is that the degrees of freedom may no longer come out as expected. The model also might not converge, as var is now both a fixed and a random effect (not that that is a bad thing in and of itself; it just means something like a fixed slope and random intercept, or vice versa).

But in the end, it comes down to multiple measurements on a single subject. It's why you can use spatial covariance structures for data that are unequally spaced in time for traditional repeated measures analyses.
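A sketch of why the G-side and R-side versions may end up agreeing, assuming each fish has exactly one observation per level of var, so that the random-effects design matrix for a fish is the identity. The notation (G, sigma^2, Sigma) is introduced here for illustration.

```latex
% G-side: RANDOM var / TYPE=UN SUBJECT=number, plus the usual residual.
\begin{align*}
\mathbf{y}_i &= \mathbf{X}_i \boldsymbol{\beta} + \mathbf{Z}_i \mathbf{u}_i + \mathbf{e}_i,
  \qquad \mathbf{Z}_i = \mathbf{I}_5, \quad
  \mathbf{u}_i \sim N(\mathbf{0}, \mathbf{G}), \quad
  \mathbf{e}_i \sim N(\mathbf{0}, \sigma^2 \mathbf{I}_5), \\
\operatorname{Var}(\mathbf{y}_i) &= \mathbf{G} + \sigma^2 \mathbf{I}_5 .
\end{align*}

% R-side: REPEATED var / TYPE=UN SUBJECT=number.
\begin{align*}
\operatorname{Var}(\mathbf{y}_i) &= \boldsymbol{\Sigma}.
\end{align*}
```

Because sigma^2 only adds to the diagonal of an already unstructured G, it is confounded with the diagonal elements of G, and the G-side model is overparameterized. When it does converge, the fitted marginal covariance G + sigma^2 I would be expected to match the R-side Sigma, in which case the fixed-effect estimates and lsmeans would match as well.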

Steve Denham


06-17-2016 02:03 PM

Thanks Steve,

I'll look into this, though from a preliminary run it looks like I get the same output from the model whether I use the repeated statement or the random statement.

Thanks for the help,

Ben