SteveDenham
Jade | Level 19

This may be simple, but when covering the use of the RESIDUAL option on the RANDOM statement versus a RANDOM _RESIDUAL_ statement in PROC GLIMMIX, I said that while the tests, lsmeans, standard errors, etc. were identical, the log likelihood and the information criteria were not. I wanted to share the (scrubbed) data and code that gave this result:

 

data one;
input subject treatment $ period sequence cmax auctlast;
datalines;
  301 A 1 1 1640 3915.57
 301 B 2 1 1210 2824.75
 302 A 2 2 553 1418.73
 302 B 1 2 608 1422.28
 303 A 1 1 666 1456.27
 303 B 2 1 369 1225.29
 304 A 2 2 604 2242.25
 304 B 1 2 708 2419.4
 305 A 1 1 1480 2993.16
 305 B 2 1 589 2500.15
 306 A 2 2 1240 2506.69
 306 B 1 2 646 1798.03
 307 A 1 1 1510 4839.33
 307 B 2 1 1040 3647.96
 308 A 2 2 513 1865.58
 308 B 1 2 441 1708.85
 309 A 1 1 1540 6989.98
 309 B 2 1 1440 4622.6
 310 A 2 2 1130 2906.11
 310 B 1 2 930 2884.38
 311 A 1 1 927 3505.7
 311 B 2 1 547 2775.85
 312 A 2 2 865 1501.44
 312 B 1 2 369 1197.5
 313 A 1 1 1050 2450.36
 313 B 2 1 877 2018.7
 314 A 2 2 834 3021.77
 314 B 1 2 690 2317.79
 315 A 1 1 432 1489.51
 315 B 2 1 499 1221.48
 316 A 2 2 1580 2594.55
 316 B 1 2 1240 2147.76
 317 A 1 1 1710 3136.27
 317 B 2 1 864 1801.83
 318 A 2 2 1360 4083.53
 318 B 1 2 984 3129.96
 319 A 1 1 1200 3146.07
 319 B 2 1 774 2913.55
 320 A 2 2 2450 5683.17
 320 B 1 2 1170 3792.46
 321 A 1 1 1090 2905.69
 321 B 2 1 685 2363.57
 322 A 2 2 543 1788.4
 322 B 1 2 522 1688.75
 323 A 1 1 1470 3497.65
 323 B 2 1 1330 3421.19
 324 A 2 2 1360 3289.24
 324 B 1 2 1500 3779.08
 325 A 1 1 1340 2357.89
 325 B 2 1 448 2207.14
 326 A 2 2 1220 5231.7
 326 B 1 2 1380 4126.9
 327 A 1 1 1310 3408.15
 327 B 2 1 902 3291.73
 328 A 2 2 1190 2653.16
 328 B 1 2 679 2268.26
 329 A 1 1 730 1898.94
 329 B 2 1 498 1501.13
 330 A 2 2 1220 2513.64
 330 B 1 2 719 2185.22
 331 A 1 1 935 2729.66
 331 B 2 1 690 2246.27
 332 A 2 2 1290 3715.9
 332 B 1 2 971 2890.33
 333 A 1 1 831 2321.1
 333 B 2 1 562 1639.56
 334 A 2 2 1070 3339.18
 334 B 1 2 982 3254.65
 335 A 1 1 827 2198.37
 335 B 2 1 534 1840.94
 336 A 2 2 1350 4001.55
 336 B 1 2 1010 3577.4
;

data one;
set one;
if subject<=318 then room=1;
	else room=2;
lcmax=log(cmax);
lauctlast=log(auctlast);
run;

proc sort data=one out=two;
by descending treatment subject room sequence period ;
run;

proc glimmix data=two; * order=data;
class subject room sequence period treatment;
model lauctlast=treatment period sequence/ddf=35,35,35;
random intercept subject/subject=room;
random period/subject=subject(room) residual type=arh(1);
lsmeans treatment / diff=control cl;
ods output lsmeans=lsmlsauctlast diffs=difflsauctlast;
run;
proc glimmix data=two;* order=data;
class subject room sequence period treatment;
model lauctlast=treatment period sequence/ddf=35,35,35;
random intercept subject/subject=room;
random _residual_/subject=subject(room) residual type=arh(1);
lsmeans treatment / diff=control cl;
ods output lsmeans=lsmlsauctlast2 diffs=difflsauctlast2;
run;

The first GLIMMIX gives a -2 res log likelihood of 4.40 and takes 7 iterations to converge. The second gives 3.12 and takes 13 iterations. Both have the same generalized chi-square = 68.00.

 

So, what is the origin of the difference in log likelihoods?

 

SteveDenham


7 Replies
jiltao
SAS Super FREQ

This can be tricky.

 

The output shows that standard errors for some of the covariance parameters are missing in both PROC GLIMMIX runs, which suggests instability in the optimization process; this is often caused by an over-specified model.

You have only two repeated measures per subject(room), and specifying the ARH(1) covariance structure in addition to (essentially) a random subject*room effect results in an over-specified model. Although PROC GLIMMIX converged, the missing standard errors for some covariance parameters confirm this. You would need to simplify your model, for example by taking SUBJECT out of the first RANDOM statement.

Two different specifications of your PROC GLIMMIX programs should produce identical results for your data if the model is not over-specified. When it is, different specifications might make the procedure take a slightly different path in the optimization process and land in a different location. So my suggestion is to simplify your model so that you get clean convergence before comparing the results.
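Jill's over-specification point can be sketched numerically (a plain-Python illustration with made-up variance values, not SAS output): with only two periods, the marginal covariance implied by a random intercept plus an ARH(1) residual block has three distinct entries but four parameters, so different parameter sets reproduce exactly the same fit.

```python
import math

def marginal_cov(s2u, v1, v2, rho):
    """2x2 marginal covariance: G-side random-intercept variance s2u
    plus an R-side ARH(1) block with variances v1, v2 and correlation rho."""
    off = s2u + rho * math.sqrt(v1 * v2)
    return [[s2u + v1, off], [off, s2u + v2]]

m1 = marginal_cov(0.5, 1.0, 2.0, 0.3)

# A different parameter set that reproduces the same marginal covariance:
s2u, v1, v2 = 0.2, 1.3, 2.3
rho = (m1[0][1] - s2u) / math.sqrt(v1 * v2)
m2 = marginal_cov(s2u, v1, v2, rho)

# m1 and m2 are identical: four parameters but only three estimable
# quantities, so the likelihood surface has a flat ridge and the
# standard errors of the covariance parameters cannot be estimated.
```

This is consistent with the missing standard errors Jill points to: the optimizer can converge anywhere along that ridge.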

Hope this helps,

Jill

SteveDenham
Jade | Level 19

Good point, @jiltao, about the standard errors. Let me say a bit about the study. It is an AB/BA cross-over for non-inferiority in dogs. Due to the sample size, it was run in two rooms, so regulatory science requires room as a random effect, and I can't remove it. The other factors in the model are typical for the design.

 

Rather than using an autoregressive structure with a random subject effect, the following code uses a Cholesky factorization of an unstructured matrix, which should eliminate the random intercept for subject. Now, the two methods still give different log likelihoods, but at least the lsmeans and standard errors differ. All covariance parameters are estimated, as are their standard errors.

 

Here is the code:

 

proc glimmix data=two; * order=data;
class subject room sequence period treatment;
model lauctlast=treatment period sequence/ ddf=32,32,32;
random intercept /subject=room;
random period/subject=subject(room) residual type=chol;
lsmeans treatment / diff=control cl;
ods output lsmeans=lsmlsauctlast diffs=difflsauctlast;
run;
proc glimmix data=two;* order=data;
class subject room sequence period treatment;
model lauctlast=treatment period sequence/ ddf=32,32,32;
random intercept /subject=room;
random _residual_/subject=subject(room) type=chol ;
lsmeans treatment / diff=control cl;
ods output lsmeans=lsmlsauctlast2 diffs=difflsauctlast2;
run;

Again, the first GLIMMIX gives a -2 res log likelihood of 4.40 and takes 7 iterations to converge. The second gives 3.12, but now also takes 7 iterations. Both have the same generalized chi-square = 68.00. The first yields identical standard errors for the lsmeans (0.07233), while the second yields standard errors that are about 30% larger and differ between the two means (0.09770 and 0.09461). However, the standard errors of the difference between the two means are identical for the two representations. I suspect that the _residual_ method somehow removes the equality constraint (adds overdispersion), but that is not apparent from the Details section of the documentation. My preference is to always specify the repeated factor in the RANDOM statement (and in the MODEL statement). However, this whole experience is causing me a fair amount of confusion.

 

jiltao
SAS Super FREQ

I got the same results running the two programs.

What version of SAS are you running?

SteveDenham
Jade | Level 19

@jiltao 

SAS 9.4 (TS1M3). I have access to TS1M6 (SAS/STAT 15.1). Let me run it there and see what happens.

SteveDenham

SteveDenham
Jade | Level 19

The results I get from SAS/STAT15.1 are identical to the results I obtained from SAS/STAT14.1, Jill.  Do you match either of the ones here?

 

SteveDenham

jiltao
SAS Super FREQ

Steve,

 

This goes back to my original comments: when you do not specify the repeated effect (the R-side random effect in PROC GLIMMIX), the results can be different if you have missing values or if the repeated measures are sorted differently.

 

You sorted your data by descending order of treatment. That makes some subjects have PERIOD in the order 2 1 rather than 1 2. So without PERIOD as the R-side random effect, the two approaches are basically analyzing different data.
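Jill's sort-order point can be illustrated outside SAS (a plain-Python sketch using the first two subjects from the posted data; the grouping logic is an illustration, not GLIMMIX internals): after BY DESCENDING TREATMENT SUBJECT, a sequence-1 subject's rows arrive in period order 2, 1 while a sequence-2 subject's arrive as 1, 2, so a position-based R matrix pairs its parameters with different periods for different subjects.

```python
# (subject, treatment, period, sequence) -- subjects 301 and 302 from the data
rows = [
    (301, "A", 1, 1), (301, "B", 2, 1),
    (302, "A", 2, 2), (302, "B", 1, 2),
]

# Emulate PROC SORT ...; BY DESCENDING TREATMENT SUBJECT;
# (two stable passes: subject ascending, then treatment descending)
rows = sorted(rows, key=lambda r: r[0])
rows = sorted(rows, key=lambda r: r[1], reverse=True)

# Within-subject period order as GLIMMIX encounters it when no repeated
# (PERIOD) effect is named on the R-side RANDOM statement:
period_order = {}
for subject, treatment, period, sequence in rows:
    period_order.setdefault(subject, []).append(period)

# subject 301 (sequence 1): periods arrive as [2, 1]
# subject 302 (sequence 2): periods arrive as [1, 2]
```

With `random period / ... residual`, the covariance positions follow the PERIOD levels regardless of row order, which is why that specification is robust to the sort.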

 

This issue also applies to your ARH(1) model with the random subject*room effect; sorry I did not realize it until now.

 

Thanks,

Jill

SteveDenham
Jade | Level 19

Jill,

 

That sound you heard a minute ago was me slapping my forehead. You kept saying that the data were different, and since there were no missing values, it had to be the sort. Once the data are sorted with period as the first within-subject key, the two methods agree completely. Thanks for taking the time to help out on this.
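The effect of the re-sort is easy to check outside SAS (a plain-Python sketch with two subjects from the posted data; illustrative only, not GLIMMIX internals): with PERIOD as the leading within-subject key, every subject's rows arrive in period order 1, 2, so the position-based and PERIOD-based specifications finally see the same data.

```python
# (subject, treatment, period, sequence) -- subjects 301 and 302, in the
# order the earlier DESCENDING TREATMENT sort left them
rows = [
    (301, "B", 2, 1), (302, "B", 1, 2),
    (301, "A", 1, 1), (302, "A", 2, 2),
]

# Re-sort so period leads within each subject: BY SUBJECT PERIOD;
rows = sorted(rows, key=lambda r: (r[0], r[2]))

period_order = {}
for subject, treatment, period, sequence in rows:
    period_order.setdefault(subject, []).append(period)

# Every subject now contributes its rows as periods [1, 2], so the
# residual covariance positions line up with periods for all subjects.
```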

 

SteveDenham

