Re: Infinite Likelihood error Proc Mixed

vvk · Posted 06-28-2014 05:39 PM

Hi All,

I am fitting a model with PROC MIXED, and I got the following warnings.

WARNING: Stopped because of infinite likelihood.

WARNING: Output 'lsmeans' was not created. Make sure that the output object name, label, or path is spelled correctly. Also, verify that the appropriate procedure options are used to produce the requested output object. For example, verify that the NOPRINT option is not used.

The model works fine with individual visit data, but it fails when I run it with all the visits.

There are no duplicate records and no missing data. The same model worked even for a smaller population.

Model Used:

proc mixed data=dsnin;
  class treatment visit subjid;
  model pchg = treatment visit treatment*visit base / solution ddfm=satterthwaite;
  repeated visit / subject=subjid type=un;
  lsmeans treatment*visit / cl alpha=0.05;
  ods output lsmeans=means;
run;

Any kind of help is greatly appreciated.

Thank you.

SteveDenham · Posted 06-30-2014 09:50 AM

Almost certainly the infinite likelihood is due to duplicate records. For each subjid, there can be only one record per "visit". If the subjid is not unique within treatment, then this error will occur. If that is the case, then change subject=subjid to subject=subjid*treatment, and all should be well. Otherwise, you are going to have to clean your data so that there are no duplicate records for visit.
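A minimal sketch of how that suggestion could be checked and applied, using the data set and variable names from the original post (dsnin, subjid, treatment, visit). The query only lists subject IDs that appear under more than one treatment; whether any exist depends on the data.

```sas
/* List any subjid that appears under more than one treatment. */
/* If this returns rows, subjid alone does not identify a subject. */
proc sql;
  select subjid, count(distinct treatment) as n_trt
  from dsnin
  group by subjid
  having count(distinct treatment) > 1;
quit;

/* Same model as posted, with the subject defined within treatment. */
proc mixed data=dsnin;
  class treatment visit subjid;
  model pchg = treatment visit treatment*visit base / solution ddfm=satterthwaite;
  repeated visit / subject=subjid*treatment type=un;
  lsmeans treatment*visit / cl alpha=0.05;
  ods output lsmeans=means;
run;
```

If the query returns no rows, the subject definition is probably not the cause, and the remaining check is for more than one record per subjid and visit.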

Steve Denham

vvk · Posted 06-30-2014 10:11 AM

Thanks Steve for your response. I will try that and check if it works.

Ahmed_Attia · Posted 06-30-2014 10:55 PM

Hi Steve,

I have a question about the error terms in a split-split plot design. I have combined two years of data as follows:

class Year Rep Irrig Tillage Variety;

model HJul24 = Irrig|Tillage|Variety;

Random Rep(Year) Rep(Year Irrig) Rep(Year Irrig Tillage);

Irrig was the main plot, Tillage was the subplot, and Variety was the sub-subplot, in three replicates. I would like to include year in the random model. My question is: should I include Tillage in the random model or not? In the paper, I plan to say:

Irrigation, tillage, variety, and their interaction were considered as fixed effects. Year, replicate, and their interaction with irrigation and tillage were considered as random effects.

Thank you very much.

Ahmed Attia

SteveDenham · Posted 07-01-2014 07:27 AM

Hi Ahmed,

If you have sufficient data to estimate the variance components with tillage included, I would put them into the random statement. However, I would be very surprised if the message "G matrix is not positive definite" or some such does not appear. The reason for still including them is that they are part of the design, and the degrees of freedom for tests should reflect the "skeleton ANOVA" (see Stroup's Generalized Linear Mixed Models for more on this).

Steve Denham

Miracle · Posted 03-15-2021 09:13 AM

Hi @SteveDenham.
I am also getting this warning - WARNING: Stopped because of infinite likelihood.
I have tried different covariance structures as advised here.
And there is only one observation row per VIS for each ID.
Any advice from anybody will be very much appreciated.

proc mixed data=final;
by labparam;
class ID resp sex tx VIS;
model chg=base resp age sex tx VIS tx*VIS / s ddfm=kenwardroger;
lsmeans tx*VIS / cl;
repeated VIS / subject=ID type=ar(1);
/* repeated VIS / subject=ID type=vc;*/
/* repeated VIS / subject=ID type=cs;*/
/* repeated VIS / subject=ID type=un(1);*/
/* repeated VIS / subject=ID type=toep;*/
slice tx*VIS / sliceby=VIS pdiff cl;
run;

SteveDenham · Posted 08-27-2021 09:10 AM

Just a thought - it looks like your response variable is a change of some sort, and you have a continuous covariate base in the model. If your response is change from base, it is possible that as iterations progress, the high correlation between these two leads to singularity of the matrix, which might be causing this.

Question: how many iterations occur before the infinite likelihood error is reached? If it is immediate, then you DO have some sort of duplicate record in there. Look at the cross-tabulation for sex*resp*tx*vis. The count in each cell should be equal to the number of unique IDs.
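A hedged sketch of one way to look for such duplicates, using the names from Miracle's post (final, labparam, ID, VIS). Because the model runs BY labparam, the key is taken within each labparam. The output table name dup_keys is made up for illustration.

```sas
/* Every labparam/ID/VIS combination that occurs more than once. */
/* dup_keys is a hypothetical table name.                        */
proc sql;
  create table dup_keys as
  select labparam, ID, VIS, count(*) as n_rows
  from final
  group by labparam, ID, VIS
  having count(*) > 1;
quit;

proc print data=dup_keys;
run;
```

An empty dup_keys table means each ID has at most one row per VIS within each labparam. It does not rule out the other causes mentioned in this thread.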

Other possibilities include age-dependent missingness, base-dependent missingness, and complete confounding of age and base. I am kind of swinging in the dark here, so I am going to mention @jiltao, who is one of the authors of the paper here

Several examples of duplication that could be missed are given there, and a pretty strong case is made that only some sort of duplication leads to this error.

SteveDenham

ccwangxi · Posted 09-06-2021 05:04 PM

Hi Steve,

Thank you so much for the instruction! You are right. The infinite likelihood error occurred immediately, and there were 4 replicates. I did not discover this because I used the wrong keyword "nodup" rather than "nodupkey" in PROC SORT.
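For readers who hit the same trap: in PROC SORT, NODUP is an alias for NODUPRECS, which compares entire records, while NODUPKEY compares only the BY variables. A minimal sketch follows, assuming a data set named have (hypothetical) with the subj and time variables used elsewhere in this thread.

```sas
/* Keep the first record per subj/time and write the rest to dups. */
/* have, deduped and dups are hypothetical data set names.         */
proc sort data=have out=deduped dupout=dups nodupkey;
  by subj time;
run;

/* Inspect the removed records before trusting the deduplicated data. */
proc print data=dups;
run;
```

NODUPKEY keeps whichever record sorts first for each key, so it is worth reviewing dups to decide which replicate should actually be retained.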

My new question is how to specify the variance-covariance structure of intensive longitudinal data. I always start from the most saturated variance structure (unstructured RANDOM and simple RESIDUAL) and then simplify based on the dataset and the study (e.g., whether AR(1) is reasonable). Would you have any recommended resources on this topic?

Best,

Ada

SteveDenham · Posted 09-07-2021 10:09 AM

I almost never look at the unstructured/Cholesky structure unless there are fewer than 5 timepoints, since most of the data I work with have a limited number of subjects. Remember, 5 timepoints require estimation of 15 parameters. If your data is "intense", say with 20 timepoints, you are talking 210 parameters. Using a rule of thumb I came across in some R vignette, that would require roughly 2100 subjects to be able to get stable estimates (R side) or have a positive definite G matrix (G side).
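The parameter counts above follow from the number of distinct elements in a symmetric t x t covariance matrix, t(t+1)/2. A small sketch of the arithmetic is below. The factor of 10 subjects per parameter is only the ratio implied by the 210 and 2100 figures in this post, not a documented rule, and the data set and variable names are made up.

```sas
/* Number of parameters in an unstructured t x t covariance matrix. */
data un_params;
  do t = 5, 20;
    n_params = t * (t + 1) / 2;      /* 15 for t=5, 210 for t=20      */
    n_subjects = 10 * n_params;      /* assumed ratio, see text above */
    output;
  end;
run;

proc print data=un_params noobs;
run;
```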

So, I tend to think about the process that generates the data. If you can exchange any two time points, then compound symmetry type covariance structures are good. If there is some sort of correlation between errors, then the autoregressive or spatial structures are usually a better choice. One thing is that if the data are unchanged and the errors in PROC MIXED are normal, you can use information criteria (AIC, AICc, BIC) to rank the models in order of retained information. For generalized mixed models, it becomes more difficult, but as long as you deal with single parameter distributions, this method is well defined.
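A hedged sketch of that comparison for two candidate structures. The data set and variable names (have, y, trt, time, subj) are placeholders. The fixed effects are kept identical in both fits, since REML-based information criteria are only comparable when the fixed-effects part of the model does not change.

```sas
/* Fit the same mean model under two covariance structures. */
proc mixed data=have method=reml;
  class subj trt time;
  model y = trt time trt*time;
  repeated time / subject=subj type=cs;
  ods output FitStatistics=fit_cs;
run;

proc mixed data=have method=reml;
  class subj trt time;
  model y = trt time trt*time;
  repeated time / subject=subj type=ar(1);
  ods output FitStatistics=fit_ar1;
run;

/* Stack the fit statistics and label each set by structure. */
data fit_all;
  length structure $8;
  set fit_cs(in=in_cs) fit_ar1(in=in_ar1);
  if in_cs then structure = 'CS';
  else if in_ar1 then structure = 'AR(1)';
run;

proc print data=fit_all noobs;
  var structure Descr Value;
run;
```

By default PROC MIXED reports these criteria in smaller-is-better form, so the structure with the lower AICc or BIC would be preferred, subject to the structure making sense for how the data were generated.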

For more information, I recommend Walt Stroup's Generalized Linear Mixed Models: Modern Concepts, Methods and Applications, as well as SAS for Mixed Models, 3rd edition.

SteveDenham

ccwangxi · Posted 08-26-2021 11:24 AM

Hi Steve,

I have a similar error and tried "repeated time/subject=subj group=treatment type=ar(1);" and "repeated treatment*time/subject=subj type=ar(1);". But there is still an error. Would there be other reasons for this error?

Ada

QiQi_22 · Posted 02-03-2021 11:34 PM

Hi,

I had the same problem as you, and I solved it by replacing "type = un" with "type = cs".

The REPEATED statement specifies the structure of the R matrix of residual variances and covariances.

The TYPE= option here specifies the structure of R.

You can dig into this problem more if you like.

Miracle · Posted 03-17-2021 12:50 AM

Thanks @QiQi_22 for your reply.
I have already tried that but still getting the warning.

jiltao · Posted 08-27-2021 11:01 AM

What happens if you add the singular=1e-8 option in the MODEL statement in PROC MIXED?

If that does not help, then I would need to have your data set to look into it further, as this type of issue is often data/model dependent.
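For reference, a sketch of where the option goes, using the model from Miracle's post. SINGULAR= is a MODEL statement option, so it sits after the slash alongside the other MODEL options; everything else is unchanged from that post.

```sas
proc mixed data=final;
  by labparam;
  class ID resp sex tx VIS;
  /* singular= tunes the sensitivity used when sweeping the X'X matrix */
  model chg=base resp age sex tx VIS tx*VIS / s ddfm=kenwardroger singular=1e-8;
  lsmeans tx*VIS / cl;
  repeated VIS / subject=ID type=ar(1);
run;
```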

Thanks,

Jill

Brendan42 · Posted 03-24-2022 11:36 AM

Sorry to bump an old post, but adding the singular=1e-8 option worked for me.

I'm working with a large multiyear agricultural dataset (30 years, 2 sites, 4 blocks, 4 levels treatment 1, 5 levels treatment 2).

Split-plot design.

I split the data into 4 subsets to make the analysis more manageable (sites analyzed separately, and then around 2004 there was a break in the data due to a summer fallow).

The models were much faster this way, and AR1 and ARH1 covariance structures remained options because there was no year gap.

The random effects structure gets a little complicated (https://link.springer.com/article/10.1007/s13593-021-00681-4), but 3 of the 4 datasets ran very well with the same code.

The fourth dataset just would not run without the infinite likelihood error on iteration 1.

No duplicate records that I could find.

Adding singular=1e-8 got it to run, and the results look reasonable.

Thanks!
