Estimates for split-plot with missing treatment co...

09-03-2013 12:36 PM

Dear all,

I am working with an unbalanced split-split-plot design (date as a split-block factor) with 4 factors: "nem" = 3 levels, "trt" = 2 levels, "cultivar" = 3 levels, and "date" = 4 levels. The following treatment combinations for the nem*cultivar interaction are missing (marked "."):

| cultivar | nem=1 | nem=2 | nem=3 |
|----------|-------|-------|-------|
| Nord     | x     | x     | .     |
| Sang     | x     | .     | x     |
| Beretta  | .     | x     | x     |

Everything looks OK when I get the results of my estimates; nevertheless, only a few of the p-values are significant, and I know from my data that I should expect more significant results. Could the problem be due to the split-plot model, where the different factors do not share the same error term as in a CRD or RCBD? Should I determine an appropriate error term for each particular estimate?

Or could the many missing observations be yielding invalid results? In my estimates I want to compare the first date vs. the last date. I have 4 dates in total, but for my first dependent variable all observations are missing for date 2, which is why I refer to date 4 as date 3 when comparing the first and the last date in the nonpositional syntax (in the positional syntax I simply omit date 2 when calculating the coefficients). Both syntaxes yield the same results for all estimates, so this should be right.
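For cross-checking the two syntaxes, it may help to remember that an LSMESTIMATE row is just a dot product of a coefficient vector with the vector of LS-means, so the positional and nonpositional forms can be verified against each other with simple arithmetic. A minimal Python sketch of that check (the two cell means here are made-up numbers, not values from the real data):

```python
# Hypothetical LS-means for two cells of the nem*trt*cultivar*date table.
# An LSMESTIMATE like [1, 2 1 1 1][-1, 2 1 1 3] is the difference of the
# two cell means; the positional form spreads the same +1/-1 pattern
# across the full coefficient vector.
lsmeans = {
    (2, 1, 1, 1): 4.20,   # nem=2, trt=1, cultivar level 1, date level 1
    (2, 1, 1, 3): 2.95,   # nem=2, trt=1, cultivar level 1, date level 3
}

# Nonpositional form: (weight, cell) pairs.
nonpositional = [(+1, (2, 1, 1, 1)), (-1, (2, 1, 1, 3))]
estimate = sum(w * lsmeans[cell] for w, cell in nonpositional)

# Positional form: a vector of zeros with +1 and -1 in the slots that
# correspond to those two cells in the sorted ordering of the levels.
cells = sorted(lsmeans)
coefs = [0.0] * len(cells)
coefs[cells.index((2, 1, 1, 1))] = +1.0
coefs[cells.index((2, 1, 1, 3))] = -1.0
positional = sum(c * lsmeans[cell] for c, cell in zip(coefs, cells))
```

If the two forms disagree, the positional vector is pointing at the wrong slots, which is exactly what a dropped date level can cause.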

Here is my code, with only some of the estimates:

```sas
proc mixed data=one;
  class blk nem trt cultivar date;
  model lgsprCysts100 = nem trt trt*nem cultivar cultivar*nem cultivar*trt
                        nem*trt*cultivar date nem*date trt*date nem*trt*date
                        cultivar*date nem*cultivar*date trt*cultivar*date
                        nem*trt*cultivar*date;
  random blk*nem blk*nem*trt blk*nem*trt*cultivar blk*date blk*nem*date
         blk*trt*date blk*nem*trt*date blk*cultivar*date blk*nem*cultivar*date
         blk*trt*cultivar*date;
  lsmestimate nem*trt*cultivar*date
    'Positional dateHafer2010 vs. dateSG2012 when nem=2 trt=1 & cultivar=Beretta'
      0 0 0 0 0 0 0 0 0 0 0 0 1 0 -1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0,
    'Nonpositional NTCD2111 - NTCD2114' [1, 2 1 1 1][-1, 2 1 1 3],
    'Positional dateHafer2010 vs. dateSG2012 when nem=2 trt=2 & cultivar=Beretta'
      0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 -1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0,
    'Nonpositional NTCD2211 - NTCD2214' [1, 2 2 1 1][-1, 2 2 1 3],
    'Positional dateHafer2010 vs. dateSG2012 when nem=2 trt=1 & cultivar=Nord'
      0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 -1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0,
    'Nonpositional NTCD2121 - NTCD2124' [1, 2 1 2 1][-1, 2 1 2 3],
    'Positional dateHafer2010 vs. dateSG2012 when nem=2 trt=2 & cultivar=Nord'
      0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 -1 0 0 0 0 0 0 0 0 0 0 0 0,
    'Nonpositional NTCD2221 - NTCD2224' [1, 2 2 2 1][-1, 2 2 2 3];
run;
```

I would greatly appreciate your help!!!!

Thank you very much!!

Caroline

Accepted Solution
Posted in reply to palolix

09-04-2013 01:26 PM

You have the greatest designs!

The E option on the PROC MIXED ESTIMATE and LSMESTIMATE statements gives the coefficients of the estimable function **L**, so you don't specify the error term the way you do in GLM.

I am more worried about the estimates. It's time to get the LSMEANS for nem*trt*cultivar*date, check those values, and try calculating (by hand!) the LSMESTIMATEs you have. If you are getting values with the wrong sign, you may need to rethink the coefficients in your statements.

Steve Denham

All Replies


Posted in reply to palolix

09-04-2013 07:45 AM

Hi again Caroline,

The code looks fine, and should return everything with the proper error terms. Take a look at the lsmeans and standard errors--do they look appropriate? Your expectation of more significances than are being found is likely due to the size of the standard errors, and to the degrees of freedom associated with each estimate. If those are incorrect, you may need to apply one of the ddfm= options. It should be using the default CONTAIN option, but a hand calculation should be applied to check this.

The other option, if possible, would be to move 'date' to a repeated statement, but only if the subplots are measured at different time points. This may drastically change both the standard error estimates and the degrees of freedom.

Steve Denham


Posted in reply to SteveDenham

09-04-2013 09:11 AM

Thank you so much Steve, I am so glad that you can help me with this situation!!!

The estimates and standard errors look odd, and the DF = 4 for all the estimates. For example, if I look at all estimates when nem=2, half of them are negative (which makes sense to me, because I know that in the last year the number of disease eggs is much higher than in the first year), but the other half of the estimates are positive, which cannot be right (because the disease eggs are higher in the last year). With respect to the standard errors, I only get 2 different standard errors across all the estimates! For example, when nem=2, for different cultivar and treatment levels, the standard errors are all the same (0.3188), and likewise when nem=3 (also 0.3188)!! You are right, the degrees-of-freedom method is CONTAINMENT.

What about the E=effect option at the end of the estimates? I think this is necessary for finding the appropriate error terms when the estimates involve main effects, but not when they involve the highest-order interaction. Am I right?

I should not use "date" as a repeated measure because the subplots were measured once a year, but not under the same conditions: every year a different cereal cultivar was cropped, and different cereals have different effects on nematode reproduction.

Thank you Steve!!

Caroline



Posted in reply to SteveDenham

09-05-2013 08:39 AM

Ohh Steve, now I get it!! I did it by hand and realized that the calculated estimates were not the same as those I got. At first I did not know why, but then I went to the class table and noticed that the last date for me was not the last date for SAS, because of the alphabetical ordering of the levels. I was handling this correctly for the other factors but not for date. I corrected the coefficients and now I get the same estimates as those I calculated myself!!! What a silly mistake!! I am so glad you suggested calculating it by hand!!
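The gotcha generalizes: by default SAS numbers character CLASS levels by their formatted (alphabetical) order, which need not match chronological order. A quick illustration in Python with made-up date labels (not the labels from the actual study):

```python
# Hypothetical date labels: chronologically Oct2012 comes first,
# but alphabetical sorting (the default ordering of character CLASS
# levels in SAS) puts it last.
chronological = ["Oct2012", "Apr2013", "Jul2013"]
class_order = sorted(chronological)  # how SAS would number the levels

# The chronologically first date is level 3 in the CLASS ordering, so
# positional coefficients written as if "level 1 = earliest date" would
# point at the wrong cell.
first_date_level = class_order.index("Oct2012") + 1
print(class_order, first_date_level)
```

Checking the "Class Level Information" table, as Caroline did, is the reliable way to see the ordering SAS actually used.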

Thank you very much Steve!!!!

Caroline


Posted in reply to palolix

09-05-2013 09:34 AM

I think that is why I "indexed" the dates when we were working on a previous analysis. Anyway, I am glad things are working out.

Steve Denham


Posted in reply to SteveDenham

09-05-2013 09:40 AM

Now, do you think it would be a good idea to add ddfm=kr or ddfm=satterthwaite to my model?

Thank you,

Caroline


Posted in reply to palolix

09-05-2013 09:54 AM

I'd go with the Kenward-Roger adjustment first, as it applies the Harville and Jeske adjustment to the variance-covariance matrix and then calculates the Satterthwaite df, which could be important with this pattern of missing data. However, don't be surprised if all of a sudden the denominator degrees of freedom drop to 1. In that case the adjustment is too conservative, and I would drop back to ddfm=satterthwaite.

Steve Denham


Posted in reply to SteveDenham

09-05-2013 12:36 PM

Mmmm, I tried KR and Satterthwaite, and with both I get the same results: <.0001 for all the estimates, and DF = 31.83 or 28.17 instead of 4. This looks very odd to me, so I would rather omit the ddfm= option in my model. Do you agree with me, Steve?

By the way, I am so happy with the results from my estimates!!! Now they look the way they should!

Thank you Steve!!

Caroline


Posted in reply to palolix

09-05-2013 12:52 PM

No. I would definitely stay with the adjustment; the DF reflect the "true" degrees of freedom associated with the error terms appropriate to the pattern of missingness.

Steve Denham


Posted in reply to SteveDenham

09-05-2013 12:57 PM

Ok. Do you think Satterthwaite would be better? I am surprised that the DF goes from 4 to 31 or 45! What about adding adjust=simulate?

Thank you,

Caroline


Posted in reply to palolix

09-05-2013 02:45 PM

DDFM=SATTERTHWAITE may or may not give exactly the same df as KR--it all depends on the Harville-Jeske adjustment, which is critical with missing data or repeated measures.

As far as adjusting, I would do that, but that is a personal preference. There is some philosophy to consider, such as whether the study is exploratory or confirmatory in nature, and the relative cost of making Type I and Type II errors. But I assume this is confirmatory, so controlling Type I error is more important. Given all of that, separate each of the estimates with a comma (it looks like that is already the case) and add after the last estimate:

```sas
/ adjust=simulate(seed=1); /* Specify the seed for the simulation. If you don't,
   you could get different adjusted p-values when you rerun the analysis.
   I use seed=1 only for convenience. */
```
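The reason for fixing the seed can be shown outside SAS: a simulation-based adjusted p-value is itself a random quantity, and only a fixed seed makes it reproducible across reruns. A toy Monte Carlo sketch in Python (a stand-in for the idea behind adjust=simulate, not SAS's actual algorithm):

```python
import random

def simulated_pvalue(observed_t, n_sims, seed):
    """Toy simulation-based adjusted p-value: the fraction of max-|t|
    draws under a null distribution that exceed the observed statistic."""
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_sims):
        # max over 4 hypothetical comparisons of |standard normal| draws
        max_t = max(abs(rng.gauss(0.0, 1.0)) for _ in range(4))
        if max_t >= observed_t:
            exceed += 1
    return exceed / n_sims

p1 = simulated_pvalue(2.5, 2000, seed=1)
p2 = simulated_pvalue(2.5, 2000, seed=1)   # same seed: identical p-value
p3 = simulated_pvalue(2.5, 2000, seed=99)  # different seed: may differ slightly
```

With the seed fixed, the two runs agree exactly; without it, each rerun of the analysis would report a slightly different adjusted p-value.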

Steve Denham


Posted in reply to SteveDenham

09-05-2013 04:12 PM

Thank you very much Steve for your good suggestions and great support!!

All the best,

Caroline


Posted in reply to SteveDenham

09-16-2013 12:57 PM

Hello again Steve,

When analyzing some dependent variables for which I only need 2 levels of the factor 'cultivar' (which has 3 levels: Nordstern, Sang and Beretta), so that SAS uses 27 of the 40 observations, I only get the LS-means but not the estimates. The problem is that in my Type 3 Tests of Fixed Effects table I only get the 'nem' and 'cultivar' main effects but not the interaction nem*cultivar (which I suppose is not estimable), so I think that is why I am not getting the estimates. When writing the estimate statements, I think I should write only the main factors and not the interaction, i.e. lsmestimate nem cultivar 'label' [1, 1 2][-1, 2 2] instead of lsmestimate nem*cultivar 'label', but if I just write nem cultivar, SAS tells me that it is expecting a * (nem*cultivar), so that does not work either.

Here is my code:

```sas
proc mixed data=one;
  where cultivar="Nordstern" or cultivar="Sang";
  class blk nem cultivar;
  model ahreave = nem cultivar nem*cultivar / ddfm=kr;
  random blk blk*nem;
  lsmeans nem*cultivar;
  lsmestimate nem*cultivar
    'nem1 vs nem2 | cultivar=Nordstern'    [1, 1 2][-1, 2 2],
    'nem1 vs nem3 | cultivar=Sang'         [1, 1 3][-1, 3 3],
    'cultivarNord vs cultivarSang | nem=1' [1, 1 2][-1, 1 3];
run;
```

Is there a way to write the lsmestimate without the interaction nem*cultivar (since it is not estimable)? Do you think this is the reason why I am not getting the estimate results?
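As background on why a missing cell breaks estimability: the mean of an empty nem*cultivar cell can only be predicted by borrowing an additivity (no-interaction) assumption from the occupied cells. A toy Python illustration with invented numbers for a 2x2 layout where cell (2,2) is missing:

```python
# Observed cell means for three of the four cells of a hypothetical
# 2x2 cross; cell (2,2) has no data.
m11, m12, m21 = 10.0, 13.0, 8.0

# Under an additive model mu + a_i + b_j, the missing cell mean is
# forced to m21 + m12 - m11. No such identity exists once the
# interaction is in the model, which is why LSMESTIMATEs on
# nem*cultivar that touch the missing cell come back non-estimable.
m22_additive = m21 + m12 - m11
```

So any contrast that can be written entirely in terms of the occupied cells remains estimable; one that implicitly needs the empty cell does not.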

I would greatly appreciate it if you could shed some light on this.

Thank you very much Steve!

Caroline