Hi,
I an trying to analyze an experiment with count data that I am having great difficulty finding a model that will converge that has all the appropriate random variables within the model.
The data structure is a summary count of events in an animal during a day. The summary counts are generated from a continuous stream recording of the data from multiple animals over 10 different days, which are unequally spaced. Day=(-1,0,1,7,14,23,24,25,26) The amount of the recording that is readable varies with each recording, as animal movement etc makes the recording unreadable at times. The data is originally reported by the instrument software on an hourly basis, but I have summarized it as a count on a daily basis for a couple reasons... the primary reason being that the # of events is very low and the overwhelming majority of hourly readings are 0. The outcome of interest is the # of events between trt groups (n=3). There are 2 blocks to the experiment. Animals are group housed and individually managed (EU=Animal). I've attached code below, but it doesn't work. Variables that cause problems are pen and block if the day component is the in the model. We want to know the effect of trt over time so day is crucial in the analysis.
I can get trt|day into the model if block becomes fixed. As we would like the inference to apply to 'all blocks' I would like this to be random.
Bottom line: The overdispersion in this dataset is will not allow me to fit a poisson or nb model (all tiand my understanding of GLIMMIX is that it won't do Zero inflated models. That leaves me with PROC NLMIXED, which I have no experience on how to set up the parameters for. Any advice for a new approach would be appreciated.
lday= log(days) day is in decimal for. i.e. 97% or .97 of the day's recording was readable. I think this is the right form to put this in but I'm not sure.
proc glimmix data=data plot=(residualpanel studentpanel);
class trt animal block pen day;
model Event=trt|day /solution dist=nb ddfm=kr offset=ldays;
random block;
random _residual_/subject=animal;
random int /subject=pen;
lsmeans trt|daymod;
run;
Before you go to NLMIXED (which should work, but is miserable to write code for), there might be some things that will help the convergence problem.
First off, what sort of error in convergence are you getting? I don't see an NLOPTIONS statement, so it may be as simple as increasing the number of iterations. The default in GLIMMIX is 20. So, try adding:
nloptions maxiter=200;
as a start. Then it becomes a matter of looking at messages in both the output and the log, and at the iteration history.
Some other things--I am missing something in the design. How are pen and block related? Is there a possibility of coming up with a single variable that incorporates all of the info on these two, so that there is only a single random effect?
See if either of these help. If this solves everything, then OK. If not, please attach a copy of the OUTPUT file for your follow-up, as I can tell a lot from that.
Steve Denham
Steve,
Thanks for the advice. I'll give those a try. I have had NLOPTIONS statement in the model on previous attempts using tech=nnridge, and it didn't change the outcome.
As far as experimental design goes. Pen contains 5 animals (EU) on the same treatment. Each block (blocked by time) has 1 pen per treatment. Animals on the same treatment (3 pens/block) are housed together, but treatment is administered individually. the random int/subject=pen was included to allow for covariance resulting from animals being housed together. Due to potential for interaction of measured outcome of interest and treatments we could not put every treatment in every pen, which would be a better design if the interaction didn't exist.
perhaps pen(block) would work?
Increased iteration resulted in the same problem
output attached
Model Information | |
Data Set | WORK.SUMBYCOMBINATION |
Response Variable | Event |
Response Distribution | Negative Binomial |
Link Function | Log |
Variance Function | Default |
Offset Variable | ldays |
Variance Matrix Blocked By | Animal |
Estimation Technique | Residual PL |
Degrees of Freedom Method | Between-Within |
The GLIMMIX Procedure
Class Level Information | ||
Class | Levels | Values |
trt | 3 | 1 2 3 |
Animal | 29 | 659 661 666 670 674 689 693 703 704 705 706 707 709 710 711 715 730 731 735 736 739 740 741 742 743 744 745 747 753 |
block | 2 | 1 2 |
pen | 6 | 1 2 3 4 5 6 |
daymod | 10 | -1 0 1 7 14 22 23 24 25 26 |
Number of Observations Read | 273 |
Number of Observations Used | 273 |
Dimensions | |
R-side Cov. Parameters | 3 |
Columns in X | 46 |
Columns in Z per Subject | 0 |
Subjects (Blocks in V) | 29 |
Max Obs per Subject | 10 |
Optimization Information | |
Optimization Technique | Dual Quasi-Newton |
Parameters in Optimization | 3 |
Lower Boundaries | 2 |
Upper Boundaries | 0 |
Fixed Effects | Profiled |
Starting From | Data |
Iteration History | |||||
Iteration | Restarts | Subiterations | Objective | Change | Max |
0 | 0 | 200 | -455.6438596 | 2.00000000 | 29997.59 |
1 | 0 | 9 | -1.94979402 | 2.00000000 | 1.095774 |
2 | 0 | 9 | 407.11750287 | 2.00000000 | 0.138335 |
3 | 0 | 9 | 728.38921391 | 2.00000000 | 0.460276 |
4 | 0 | 8 | 931.11014255 | 2.00000000 | 0.061218 |
5 | 0 | 8 | 1041.5867339 | 1.69680719 | 1.935949 |
6 | 0 | 6 | 1102.2001401 | 2.00000000 | 0.179453 |
7 | 0 | 5 | 1130.2017164 | 2.00000000 | 0.566165 |
8 | 0 | 4 | 1146.2510031 | 0.79829980 | 5.150145 |
9 | 0 | 200 | 1132.0235691 | 1.99938081 | 165205.1 |
10 | 0 | 3 | 1134.1023687 | 0.29734249 | 151.1646 |
11 | 0 | 3 | 1135.3279743 | 0.22783827 | 51.98438 |
12 | 0 | 3 | 1136.4465168 | 0.18472565 | 20.2837 |
13 | 0 | 4 | 1137.6649059 | 0.15563163 | 0.111308 |
14 | 0 | 3 | 1139.4294709 | 0.13508566 | 6689.262 |
15 | 0 | 3 | 1142.7088799 | 0.12034232 | 8694.592 |
16 | 0 | 3 | 1149.3769493 | 0.12375223 | 32.88927 |
17 | 0 | 4 | 1162.5838831 | 0.14883198 | 0.460491 |
18 | 0 | 3 | 1186.4382959 | 0.17336505 | 1.185575 |
19 | 0 | 3 | 1223.6597018 | 0.20593961 | 0.833072 |
20 | 0 | 3 | 1273.2062406 | 0.25945083 | 5.382639 |
21 | 0 | 3 | 1332.0993857 | 0.35533534 | 7185.176 |
22 | 0 | 3 | 1399.3189684 | 0.52576278 | 8350.74 |
23 | 0 | 3 | 1479.4357778 | 0.88493275 | 0.514068 |
Did not converge. |
Covariance Parameter Estimates | |||
Cov Parm | Subject | Estimate | Standard Error |
Variance | animal | 9.794E-7 | . |
CS | animal | 1.332E-7 | . |
Scale | 4609159 | . |
Thanks
It looks like the number of function calls is what is causing the lack of convergence. Those iterations with 200 subiterations look dangerous.
Try:
nloptions maxiter=200 maxfunc=2000;
You may want to try tech=congra in there as well.
I think we are in a really bad part of the likelihood surface, if the covariance parameters are that small. Can you reparameterize--say by multiplying the dependent variable by 1000?
Steve Denham
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.