
# Difference between "subject = ID" and "subject = ID(trt)"

1. What is the difference, when using PROC MIXED, between specifying subject = ID and subject = ID(trt)?

For example: 40 animals are blocked by body weight (BW), so that each treatment has equal starting BW, and randomly assigned to 1 of 4 treatments.

I would think you would use ID(TRT), but when would you use just "ID"?

2. Another related example: We sampled & compared nutrients of 4 different tree species from 4 different "Sites."  There is only 1 tree species per Site. Within each site, we sampled trees (of 1 species) only once, from 4 different plots; experimental unit is Plot (n=4).  Basically, species = site.

Thus, input data looks like:

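| SITE | PLOT | SPECIES | Protein concentration, etc. |
|------|------|---------|-----------------------------|
| 1 | 1 | A | |
| 1 | 2 | A | |
| 1 | 3 | A | |
| 1 | 4 | A | |
| 2 | 1 | B | |
| 2 | 2 | B | |
| 2 | 3 | B | |
| 2 | 4 | B | |
| 3 | 1 | C | |
| 3 | 2 | C | |
| 3 | 3 | C | |
| 3 | 4 | C | |
| 4 | 1 | D | |
| 4 | 2 | D | |
| 4 | 3 | D | |
| 4 | 4 | D | |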

Questions: Should I number the plots 1 to 16 rather than 1 to 4 within each site, or does it matter?

Is the following RANDOM statement correct? It seems to me that the random error term, plot, needs to be nested within species. Is that right?

PROC MIXED;
CLASS site plot species;
MODEL Protein = species;
RANDOM plot(species);  /* or: RANDOM plot; */
RUN;

Many thanks in advance for your time. This is a great site for making sure I'm doing things correctly, vs. assuming... I need to start sending out gift cards for all the great replies.

Accepted Solutions
03-11-2013 07:47 AM

## Re: Difference between "subject = ID" and "subject = ID(trt)"

Q1: If every animal has a unique ID, the two statements are equivalent. The nested form, ID(trt), is needed only if IDs are reused across treatments, for instance if the animals within each treatment were numbered 1 to 10.
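To make the distinction concrete, here is a minimal sketch. It assumes a hypothetical data set `animals` with variables `id`, `trt`, `day`, and a response `y`, and it assumes repeated measures on each animal with a compound-symmetry structure. None of these names or choices come from the original question; they are only for illustration.

```sas
/* Hypothetical data set and variable names (assumptions, not from the thread). */

/* Case 1: IDs are unique across treatments (e.g. 1 to 40).
   subject=id and subject=id(trt) identify the same 40 subjects. */
proc mixed data=animals;
  class id trt day;
  model y = trt day trt*day;
  repeated day / subject=id type=cs;
run;

/* Case 2: IDs restart within each treatment (e.g. 1 to 10 in each of 4 treatments).
   The nested form keeps animal 1 in one treatment distinct from
   animal 1 in another treatment. */
proc mixed data=animals;
  class id trt day;
  model y = trt day trt*day;
  repeated day / subject=id(trt) type=cs;
run;
```

In Case 2, writing `subject=id` would pool the four animals that share each ID number into a single subject, which is usually not what is intended.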

Q2: Either numbering system will work, but this is where the nested version starts to become important. I notice that site and species are completely confounded, so my preference would be to number the plots 1 to 16. With only one observation per plot, though, there is no way to estimate a plot random effect separately from the residual, so the following is all that you would need:

PROC MIXED;
CLASS species;
MODEL Protein = species;
RUN;
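If you do want plot numbers that are unique across sites, a DATA step along these lines would do it. This is only a sketch: the data set name `trees` is hypothetical, and it assumes `site` and `plot` are numeric and coded 1 to 4 as in the table in the question.

```sas
/* Hypothetical data set name; site and plot assumed numeric, coded 1-4. */
data trees_id;
  set trees;
  /* Map (site, plot) to a unique plot number from 1 to 16. */
  plot_id = (site - 1) * 4 + plot;
run;

/* plot_id is kept for bookkeeping only; it is not needed in the model. */
proc mixed data=trees_id;
  class species;
  model Protein = species;
run;
```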

Steve Denham

All Replies

## Re: Difference between "subject = ID" and "subject = ID(trt)"

Hey Steve. Man... you are a great resource on this discussion board.

For the 2nd question, related to the species, basically species = site.  Within each site, we replicated using 4 separate plots (Plot is experimental unit).

So I think (given your answer to #1) that we need to number the plots 1 to 16 and use "Random Plot", correct?

Or leave the numbering 1 to 4 (within site) and use random plot(site).

Thoughts?


## Re: Difference between "subject = ID" and "subject = ID(trt)"

It shouldn't make any difference. For this design, if you specify plot as a random effect (either way), the result will be the same as not specifying it at all. The 'plot' effect is the residual variability, so it isn't needed. If you had replication within plots, or wished to consider the various IVs as some sort of repeated/multivariate measure, then plot would need to be specified.
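One way to check this on your own data is to fit the model both ways and compare the covariance parameter estimates and the test for species. This is a sketch only; the data set name `trees` is hypothetical, and plot is assumed to be numbered 1 to 4 within each species.

```sas
/* Hypothetical data set name (assumption, not from the thread). */

/* Model A: no RANDOM statement. The residual variance is the
   plot-to-plot variance within species. */
proc mixed data=trees;
  class species;
  model Protein = species;
run;

/* Model B: plot nested within species as a random effect.
   With one observation per plot, plot(species) and the residual
   index the same 16 observations. */
proc mixed data=trees;
  class species plot;
  model Protein = species;
  random plot(species);
run;
```

The expectation is that the inference for species does not change between the two fits, because Model B has no information with which to separate the plot variance from the residual variance.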

Steve Denham
