turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- Analytics
- /
- Stat Procs
- /
- GLIMMIX warning: obtaining minimum var. quadratic ...

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-31-2012 02:05 PM

Hello All,

Sorry for the long discussion title. I've found a couple previous discussions on this warning, but none of the solutions proposed seem to fix my problem. So, briefly here is a description of the data I'm trying to analyze, the problem I'm having, and the solutions I've already tried:

Data is from a pollination experiment focusing on 2 species. I had

- 10 plots randomly assigned to 2 main treatments: Unmanipulated, and Removal

- within each plot, 10 randomly selected individuals were monitored

- of the 10 individuals in each plot, 3 were assigned to a pollen supplementation treatment, the other 7 were left unmanipulated.

- I collected seed count data for every flower on every individual as my response variable, and recorded flower position on the flowering stem as an important covariate.

I am treating individuals as subjects for which i have repeated measures (multiple flowers per individual). I have strongly correlated within-individual responses with respect to flower position, so, I have tried to account for those correlated responses with my R-side covariance structure as follows:

**proc** **glimmix**;

class treatment plot poll_supp flowpos_1;

model nseeds = treatment poll_supp flowpos treatment*poll_supp

treatment*flowpos poll_supp*flowpos treatment*flowpos*poll_supp

/ dist=negbin offset=lntovules

ddfm=kenwardroger solution;

random flowpos_1/ subject=ind type=ar(**1**) residual;

random plot;

**run**;

However, no matter how I tweak this code (e.g. dropping the higher order interactions, using different dist=, ddfm=, with or w/o offset=), so long as I include "random flowpos_1/ subject=ind type=ar(**1**) residual;" it blows up and gives me the Warning: obtaining minimum var. quadratic unbiased estimates as starting values for the covariance parameters failed. At first, I thought it was just a d.f. deficiency, so I dropped interactions, and used a copy variable flowpos_1 to order the observations for the random statement, using flowpos as a continuous variable instead (this is actually a preferable formulation, but I originally tried it to sneak back some d.f.'s). Plus, the dataset has ~407 observations, and fingermath on the number of parameters being estimated checked out ok, so I think I am ok for d.f. My second thought was that this is just a numerical issue because of strong imbalance in the observations at different levels of flowpos (many individuals with 1-10 flowers, very few with 11,12; actually only 2 observations at 12, w/ variance=0)... so I tried subsetting the dataset and dropping the observations at the highest levels of flowpos... but am still having the same issue...

So... My conclusion is that I am writing the code for that random statement incorrectly, but can't find any good examples to follow. If anybody has any advice or suggestions I would appreciate any help I can get. I know there are alternative formulations to account for my within individual correlated responses, but this seems like the formulation that does the best job of staying true to the biology in the experiment, and if I can avoid dropping the R-side covariance structure to account for within-individual correlated responses with respect to flowpos, I'd like to.

thanks again for any help or suggestions.

cheers,

Colin

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-01-2012 08:26 AM

I think this is a data dependent problem, more than a syntax problem. I will bet that most of your information is at a limited number of flowpos values. Reparameterizing these to something where the responses are more evenly spaced could help if that is the situation.

I also think you would have better luck not having two versions of the flowpos variable. Without data to try this on, what about sorting the data set **by plot treatment poll_supp ind flowpos?**

Then trying to execute:

**proc** **glimmix data=sorteddata initglm scoring=5**;

class treatment plot poll_supp flowpos ind;

model nseeds = treatment poll_supp flowpos treatment*poll_supp

treatment*flowpos poll_supp*flowpos treatment*flowpos*poll_supp

/ dist=negbin offset=lntovules

ddfm=kenwardroger solution;

random flowpos/ subject=ind type=ar(**1**) residual;

random plot;

**run**;

I included ind as a class variable. The subject=ind assumes that all individuals are uniquely identified, so that an infinite likelihood is avoided. The initglm option should generate starting values based on excluding the random effects, and the scoring=5 will give five iterations of Fisher scoring to get things started in the right direction (hopefully).

Good luck on this one.

Steve Denham

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-01-2012 04:56 PM

Hi Steve,

Thanks for taking the time to look this over. My dataset is ordered by plot-ind-flowpos, and both plots and individuals are uniquely identified. Since treatments were applied to whole plots, and poll_supp applied to whole individuals, sorting by those two variables doesn't change the ordering. Unfortunately, I tried your suggestions (and a few tweaks on them), and still no luck... but now have more data (so to speak) on the problem.

I tried chopping the data down so I had close to a balanced dataset with respect to flowpos, still the same warning... this, and a conversation with my supervisor led to thinking that the problem may have to do with my choice of ar(1) for the covariance structure... I did have plants that were missing flowers (e.g. - had flowers 1,2, ,4,5, with flower 3 damaged or missing), and since ar(1) requires equal spacing (as you mentioned), we decided to try using the random statement:

random _residual_ / subject=ind type=sp(exp)(flowpos_1);

I tried all the previous formulations and options with this new random statement, and still get the same warning message... in fact, there are only two things I can do that will not result in this warning message:

1) do not include flowpos in any way to order observations for estimating covariance parameters. For example, the model will run, and converge if I use: "random _residual_ / subject=ind type=ar(1) vcorr ;"

However, my vcorr matrix is 405x405 (my total # of observations), and has a strange block-diagonal form, with blocks that correspond to *plot* rather than *ind*. This seems strange to me, and I'm not sure what in this random statement is telling SAS to do this. I get the warning if I use:

"random _residual_ / subject=ind type=sp(exp)(flowpos_1) vcorr ;"

2) If I do not tell SAS to treat this as an R-side effect, it will at least go through optimization iterations (but fail to converge). The minute I tell SAS to treat this as an R-side effect, I get the same old warning. But this is clearly an R-side covariance issue. ....

So, I am vexed. Why would incorporating flower position as an index for the covariance structure cause this particular problem? and why does the fact that this is R-side result in failure to obtain minimum variance quadratic unbiased estimates as starting values for the covariance parameters?!? Hmm...

anyway, thanks again for taking the time to think this through. I appreciate it. My supervisor will probably take a crack at it on his own time... so while I'm farming out help, if you are interested enough to fiddle with this as well, I'd be happy to provide a copy of the relevant data. At this point I'm committed to figuring this out on principle!

thanks again,

Colin

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-01-2012 05:32 PM

EUREKA! sort of...

I discovered why I was getting 'The Warning': While MOST of my observations came from flowers on the main flowering stalk, al had a few from axial stems. Flowers on axial stems were numbered 1,2,3,... just like the flowers on the main stalk. This led to duplicate flowpos values, which blew up the covariance parameter estimation. As this was my preliminary foray into this analysis, i didn't even include the variable 'axial'...

however, I still have a couple things I don't fully understand. Even after putting a band-aid over the axial flower issue by restricting my analysis to flowers on the main stalk, I still have a strange vcorr matrix: a #obs x #obs block diagonal matrix by plot. Also, the Dimensions output reports "Max Obs per Subject: 403"... I would have thought that this should be the maximum flowpos value: 12. Is there something I'm not understanding about how the R-side covariance matrix is constructed?

thanks again,

Colin

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-04-2012 08:02 AM

Hi Colin,

A couple thoughts--I think GLIMMIX defaults to the G matrix when counting subjects. Not really sure, but that is what it looks like. For repeated measures in time, we often use AR+RE models (autoregressive plus random effect), which gives the correct count of subjects.

These look like:

proc glimmix data=for_Stats1 ic=pq;

by param ;

class sex grp_no studyday anml_nbr ;

model value = grp_no

sex

studyday

grp_no*sex

sex*studyday

grp_no*studyday

grp_no*sex*studyday

cov/ ddfm=kr(firstorder);

random studyday /residual type=&covtype subject= anml_nbr ;*group=grp_no;

%if "&covtype" = "AR(1)" OR "&covtype" = "ARH(1)" %then %do;

random intercept/subject=anml_nbr;

%end;

ods output fitstatistics = &outdata._a convergencestatus = &outdata.status ;

run;

You may need something similar to the second random statement to get the subject number where you want it.

My other thought is: Why AR(1) as a structure? I would really go with the spatial covariances like sp(exp) or sp(pow), as you indicated above, given that the indexing reflects some sort of measurement. These are a lot better at handling missing values (my opinion only). The other alternative might be a heterogeneous compound symmetry (CSH), as both AR(1) and the spatial covariances really depend on the indexing. If things are unevenly spaced in reality, but indexed as 1, 2, 3, ..., you are making an assumption about the correlations that might be misleading.

Good luck.

Steve Denham

Message was edited by: Steve Denham