BookmarkSubscribeRSS Feed
michel
Calcite | Level 5

Hi,

I posted here with a question about using PROC GLIMMIX on my own repeated-measures data, and got some great answers!  Now I'm helping a colleague analyze his repeated-measures data, and I can't get it to converge. 

The response variable (response) is a count variable ranging from 0-19, but very left-skewed with many zeros and ones (to give you an idea: median=2, 95th percentile=8)

Fixed effects: sex (M/F), treatment (4 levels, categorical), and trial.  Trial is the proxy for time, and this is where things get complicated: subjects (=bird) were subjected to as many as 12 trials, but if they "passed", the trials would be stopped early.  Some birds went through as few as 6 trials.

Random effect: nest.  Many of the birds came from the same nest (i.e., are related).

Here's my code:

proc glimmix data=MyData plots=(all) method=RSPL noreml IC=PQ plots=residualpanel order=data;

  Class bird sex treatment trial nest;

  tr = trial;

  model response = sex|treatment|trial /  dist=poisson link=log ddfm=kr; 

  random trial / residual subject=bird(nest) type=sp(pow)(tr);

  random int / subject=nest type=chol;

  random _residual_ / subject=bird(nest);

  nloptions maxiter=500 tech=congra;

run;

I have tried multiple variations of this code, including:

- method=RMPL, MMPL, MSPL

- dist = negbin, gamma, gaussian/log

- ddfm = betwithin, contain, {default}

- R-side covariance structures = ar(1), ante(1)

- G-side covariance structures = un, {default}

- removing bird from the class statement (the data is sorted by "bird trial")

- I've also attempted running this model with all G-side random effects and using method=LAPLACE or QUAD.

Nothing works.  Either I can't get it to converge - in which case I get the "pseudo-likelihood update fails in outer iteration #" error message.  Or it converges, but all the F values are higher than I would expect, and most-all LS means are "Non-est".  I was also getting a "data label collision avoidance" error for a while, but was able to avoid that by entering "LABELMAX = 12400" to the ODS GRAPHICS statement.

Any suggestions?  Thanks in advance!

3 REPLIES 3
SteveDenham
Jade | Level 19

Before we get into NLMIXED or FMM to handle zero inflated distributions, try one more--all G side random effects, method=LAPLACE, dist=negbin, scoring=fisher, remove bird as an effect, keeping it only as a subject, technique=trureg.

proc glimmix data=MyData plots=(all) method=laplace scoring=500 exphessian IC=PQ plots=residualpanel order=data;

  Class bird sex treatment trial nest;

  tr = trial;

  model response = sex|treatment|trial /  dist=negbin link=log ddfm=bw; 

    random int / subject=nest type=chol;

    random trial /  subject=bird(nest) type=sp(pow)(tr);

  nloptions maxiter=500 tech=trureg;

run;

But I have to confess that I think this will blow up as well.  I believe the data is a mixture of distributions--a zero inflated binomial, and a zero inflated poisson.  Check out PROC FMM.  I have never used it, but it might be the only thing that can handle what is happening here.

Steve Denham

michel
Calcite | Level 5

Hi Steve,

Thanks again for your help!  As you suspected, this still didn't solve the convergence problem.  I think you're right that there are multiple distributions here, and the sample size is too small.  I'll look into Proc FMM.

lvm
Rhodochrosite | Level 12 lvm
Rhodochrosite | Level 12

For problems like this, it may be helpful to start with a simpler model, such as one with no random effects. This will tell you if you have some problems with the coding of the data (for the fixed-effects part) , or if you have evidence of finite mixtures. Look at the residual plots. Then start adding the relevant random effects. You do have problems with your random effect syntax. First of all, with negative binomial, you automatically have an overall residual (R-side) scale parameter for the conditional distribution. But then in your original post you have two separate random statements that are both coding the R-side residual scaling parameter (either using the RESIDUAL option or the _RESIDUAL_ keyword). This is creating all kinds of conflicts and overparameterizations. Plus, I think you will get very strange results (and maybe not interpretable results) by trying a temporal autocorrelation structure for the R-side residual. If you read the recent discussion in the SAS/STAT discussion board, you will see some arguments against R-side repeated measures for GLMMs (open to debate). Also, the chol option won't do anything here because you have a simple variance component term for this statement. Start with a simpler random structure:

random int bird / sub=nest;

which is equivalent to

random nest bird(nest);

At least, this will let you get started. If this works,then you can consider more complex models (if needed).

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 1343 views
  • 3 likes
  • 3 in conversation