
Proc GLM vs. Proc MIXED: how to specify error term


Posted 11-19-2018 12:00 PM (1377 views)

Imagine a ring trial with 10 samples, each tested by every lab. There are 30 laboratories, each using one of 3 test kits, so the head of the data set might look like the attachment.

The main effect of test kit can be analysed in PROC GLM via

```
PROC GLM DATA=dataset;
   class sample test lab;
   model y = sample test lab(test) sample*test;
   /* test whether the test kits differ, taking lab(test) as the error term,
      i.e. assuming labs are specific to a test kit */
   test H=test E=lab(test);
   lsmeans sample;
   lsmeans test / E=lab(test) tdiff pdiff;
RUN;
```

How can I check for an overall effect of the test kits AND for an effect of test kit under a different error structure in PROC MIXED?

At the moment I need two separate runs, the first for the overall effect:

```
proc mixed data=dataset;
class sample test lab;
model y = sample test lab(test) sample*test;
run;
```

and a second one that redefines the random component / error structure:

```
proc mixed data=dataset covtest;
class sample test lab;
model y = sample test sample*test / ddfm=BETWITHIN;
random lab lab*test;
run;
```

In the latter I have to specify the DDFM= option, otherwise no degrees of freedom can be calculated. In the COVTEST results, the test*lab interaction has an estimate of 0 (?); specifying NOBOUND does not alter the results, and METHOD=TYPE3 does not give any results at all.

Am I on the right path, and can I combine the two outputs in PROC MIXED to show that there is a significant lab*test interaction but no overall difference between test kits?

I am reading SAS for Linear Models (Littell et al.), but it seems like I am missing something. Working with SAS 9.4 on Windows.

6 REPLIES


Sorry, I didn't want to inflate the spreadsheet too much and only wrote down the head of the data as an example. Please see the full structure attached. There are 39 laboratories x 10 samples, i.e. 390 observations. Of the labs, 4 use test A, 13 use test B and 22 use test C. y is continuous, ranging from approx. -10 to +200.

The only note in the log (for the second PROC MIXED run) is: Estimated G matrix is not positive definite.


Thanks for the additional information. The message that the G matrix is not positive definite is an indication that there is a problem with the random effects you are trying to fit. Your RANDOM statement has LAB*TEST, but each of your labs only saw one level of TEST. It is difficult to estimate that interaction when your design did not account for the interaction effect: you would ideally need each lab to perform all 3 tests if you wanted to measure the LAB*TEST effect. The best you can do now is to drop LAB*TEST from the RANDOM statement. Since you want to use DDFM=BW, I would also change your RANDOM statement to

```
random int / subject=lab;
```

so that MIXED knows how to break up the between- and within-subject effects. Without a SUBJECT= effect on a RANDOM statement, BW will assign all effects the residual df.
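Putting this advice together with the poster's second run, a single combined call might look like the following sketch (the data set and variable names `dataset`, `y`, `sample`, `test`, `lab` are taken from the thread; treat it as a starting point, not a definitive model):

```sas
/* Sketch: lab as the only random effect, written with SUBJECT=     */
/* so that DDFM=BW can split between- and within-subject df.       */
/* LAB*TEST is dropped because each lab used only one test kit.    */
proc mixed data=dataset covtest;
   class sample test lab;
   model y = sample test sample*test / ddfm=bw;
   random int / subject=lab;
run;
```

The F test for TEST in this run uses between-lab degrees of freedom, which plays the role that E=lab(test) played in the PROC GLM code above.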


Thank you very much for your quick reply. The latter does make sense, and the degrees of freedom are now displayed correctly.

Maybe I repeat myself, but isn't the test simply nested within the lab, like classrooms within schools? Why is one RANDOM statement sufficient rather than two:

```
random int / subject=lab;
random int / subject=test(lab);
```

Admittedly, with these I would again get the notes:

NOTE: Convergence criteria met but final Hessian is not positive definite.

NOTE: Estimated G matrix is not positive definite.

Just for my understanding, I don't see it yet. The test is the same, but it can be applied differently in each lab (which is what I expect). Example: even if the same sample is examined in two labs using the same test kit, the measurements will (possibly) differ.

Besides, are the samples considered independent, i.e. as 390 samples rather than just 10?



Are the samples the same across the labs? For example, is sample 1 in lab 1 the same as sample 1 in lab 2, or are the samples just reps in your experiment? If the samples merely represent replications within a lab (i.e., sample 1 in lab 1 is not the same sample as sample 1 in lab 2), then you do not want SAMPLE as an effect in your model; TEST would be the only effect on the MODEL statement in that case.

If each lab had run several of the tests, then you could model lab*test as a random effect. If not, you can only model lab as random. The data simply will not support a more complicated variance structure.
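For the replication scenario described above (samples are reps, not a shared factor), the model would reduce to something like this sketch (again reusing the thread's names `dataset`, `y`, `test`, `lab`; this is an illustration of the reply's suggestion, not code from the thread):

```sas
/* Sketch for the case where samples are mere replicates within a lab: */
/* SAMPLE is dropped from the model, TEST is the only fixed effect,    */
/* and lab remains the single random (subject) effect.                 */
proc mixed data=dataset covtest;
   class test lab;
   model y = test / ddfm=bw;
   random int / subject=lab;
run;
```

Since the poster later confirms the 10 samples are identical across labs, the earlier model with SAMPLE as a fixed effect is the appropriate one here; this variant only illustrates the alternative the reply raises.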


The 10 samples are the same for every lab, i.e. sample 1 in lab 1 equals sample 1 in lab 2. That is exactly my question: personally, I would model test(lab) in a second random term, but according to your first answer, that is not possible?
