Re: Nonlinear Mixed Effect Model Analysis

Sandhu · Posted 12-27-2012 07:07 AM

Hello.. I am doing analysis of repeated measure data obtained from a clinical trial. There are total 12 time points (one baseline and 11 follow-up) and time is coded in Hours with baseline time coded as "0". Outcomes of trial are three, one continuous (pain assessment using 100 mm VAS scale) and two binary (both binary outcomes coded as 0 and 1). Two predictors (covariates) are sex (male coded as 0 and female coded as 1) and age groups (pre-adolescent coded as 0 and post-adolescent coded as 1). Therefore from statistical analysis point of view, it is a 2x2 factorial design. Objective of my trial are to evaluate the main effect and interaction effect of age and sex on pain (VAS score). The mean profile plot shows a nonlinear quadratic trend of VAS score. I am familiar with PROC MIXED AND PROC GLIMMIX procedure and one possibility is to use polynomial random coefficient model analysis using these two aforementioned procedures. But my objective is not to investigate the trend of pain but to evaluate the overall effect of age, sex and age sex interaction on VAS score. Therefore, i was just thinking about the use of PROC NLMIXED in this case. But honestly speaking, i find it very difficult to use NLMIXED procedure and not able to understand syntax and its application. for example, how to get initial PARMS values etc etc. PROC GLIMMIX is very versatile procedure and i am using it for my two binary outcomes. Is there any possibility that i can use PROC GLIMMIX for nonlinear longitudinal data analysis instead of using PROC NLMIXED?

Regards

SteveDenham · Posted 12-27-2012 07:48 AM

When I think of nonlinear, I assume it to mean nonlinear in the parameters, such that a polynomial is not what I think of as nonlinear. Given that, it makes perfect sense to me to use GLIMMIX to explore this. If you have access to SAS/STAT 12.1, check out the EFFECT statement. It will enable you to examine polynomial effects specifically (as well as spline effects, which may be more appropriate).

Steve Denham

Sandhu · Posted 12-27-2012 09:24 AM

Dear Steve Denham,

Thanks a lot for replying. I am using SAS 9.2 and do not have access to SAS/STAT 12.1. Following is the code that i can think of fitting a second order (quadratic) polynomial trend by using GLIMMIX

proc glimmix data=data ORDER=Data nobound;

class id sex age;

model vas = time time*time

sex age sex*age

sex*time age*time sex*age*time

sex*time*time age*time*time sex*age*time*time

/ ddfm=kr s cl ;

random int time time*time / subject=id type=un v;

lsmeans sex / pdiff adjust=simulate(acc=.0005 seed=121211 report) adjdfe=row cl;

lsmeans age / pdiff adjust=simulate(acc=.0005 seed=121211 report) adjdfe=row cl;

lsmeans sex*age / pdiff adjust=simulate(acc=.0005 seed=121211 report) adjdfe=row cl;

run;

is this code appropriate for analysis of quadratic trend? should i use random _residual_ statement also to model covariance structure e.g.

random _residual / sub=id type=sp(pow)(xtime). i am thinking of spatial power because my time points are not equally spaced. i can use xtime as class variable and time as continuous variable. So code will be like this:

proc glimmix data=data ORDER=Data nobound;

class id sex age xtime;

model vas = time time*time

sex age sex*age

sex*time age*time sex*age*time

sex*time*time age*time*time sex*age*time*time

/ ddfm=kr s cl ;

random int time time*time / subject=id type=un v;

random _residual / sub=id type=sp(pow)(xtime);

lsmeans sex / pdiff adjust=simulate(acc=.0005 seed=121211 report) adjdfe=row cl;

lsmeans age / pdiff adjust=simulate(acc=.0005 seed=121211 report) adjdfe=row cl;

lsmeans sex*age / pdiff adjust=simulate(acc=.0005 seed=121211 report) adjdfe=row cl;

run;

is this code for analysis and multiple comparison by using lsmeans statement appropriate? or should i use ESTIMATE statement for doing multiple comparisons.

Thanks a lot for the help

Regards

Sandhu

SteveDenham · Posted 12-27-2012 09:37 AM

This looks exactly like what you will need. I will be curious to find out if the random _residual_ statement will work. I think it should, but I worry some that if you have a large number of xtime levels, you may have convergence problems.

Good luck, and please let the community know if this approach worked.

Steve Denham

Sandhu · Posted 12-27-2012 10:46 PM

Dear Steve Denham,

I tried this code with random _residual_ statement for my data which comprised of 280 subject and each subject had 12 repeated measurements. your prediction was correct. It did not work. The log was :

NOTE: Convergence criterion (ABSGCONV=0.00001) satisfied.

WARNING: Optimization stopped because of infinite objective function.

NOTE: PROCEDURE GLIMMIX used (Total process time):

real time 1.01 seconds

cpu time 0.96 seconds

It converged but "Optimization stopped because of infinite objective function." this happened with both Dual Quasi-Newton and Newton-Raphson with Ridging optimization techniques. If i am correct, the Newton-Raphson with Ridging technique makes PROC GLIMMIX almost similar to PROC MIXED. So this mean random _residual_ in PROC GLIMMIX or Repeated statement Repeated xtime/ subject=id type=sp (pow)(xtime) in PROC MIXED will not work? if so, then how to model covariance structure.

Regards

Sandhu

SteveDenham · Posted 12-28-2012 02:49 PM

If the model converged but the optimization stopped, I would look at the ddfm=kr option on the model statement as the most likely suspect, as this is a secondary process applied to the converged model. I am pretty sure that the unequal spacing leads to some problems with the full Kenward-Rogers procedure under unbalanced situations. You might be able to work around this with ddfm=kr(firstorder). If your data are balanced across sex and age, then you could try ddfm=satterthwaite. And if there are still problems, ddfm=bw, where bw is between-within.

Steve Denham

Sandhu · Posted 12-29-2012 02:05 AM

Dear Steve Denham,

There are couple of things which i feel are responsible for these problems.

When i was running the original quadratic code, i was getting G matrix not positive warning as follow:

proc glimmix data=data ORDER=Data ;

class id sex age ;

model vas = time time*time

sex age sex*age

sex*time age*time sex*age*time

sex*time*time age*time*time sex*age*time*time / ddfm=kr s cl ;

random int time time*time/ subject=id type=un v;

run;

NOTE: Convergence criterion (GCONV=1E-8) satisfied.

NOTE: At least one element of the gradient is greater than 1e-3.

NOTE: Estimated G matrix is not positive definite.

NOTE: PROCEDURE GLIMMIX used (Total process time):

real time 22.37 seconds

cpu time 22.12 seconds

Therefore, i used nobound option in the procedure as recommended (by Little 2006) for PROC MIXED procedure:

proc glimmix data=data ORDER=Data nobound ;

class id sex age ;

model vas = time time*time

sex age sex*age

sex*time age*time sex*age*time

sex*time*time age*time*time sex*age*time*time / ddfm=kr s cl ;

random int time time*time/ subject=id type=un v;

run;

It worked fine for quadrartic code even after adding nobound.

Now when i added random _residual_ statement in this above code along with nobound option, the problem of "Optimization stopped because of infinite objective function." as i mentioned in the previous post. This problem persisted even with ddfm=kr(firstorder), ddfm=satterthwaite and ddfm=bw .

But when i removed nobound, it worked even with ddfm=kr.

proc glimmix data=data2 ORDER=Data ;

class id sex age xtime;

model vas = time time*time

sex age sex*age

sex*time age*time sex*age*time

sex*time*time age*time*time sex*age*time*time / ddfm=kr s cl ;

random int time time*time/ subject=id type=un v;

random _residual_ / sub=id type=sp(pow)(xtime);

run;

Log is:

NOTE: Convergence criterion (GCONV=1E-8) satisfied.

NOTE: At least one element of the gradient is greater than 1e-3.

NOTE: Estimated G matrix is not positive definite.

NOTE: PROCEDURE GLIMMIX used (Total process time):

real time 23.30 seconds

cpu time 23.10 seconds

Now here is my dilemma : should i use nobound option and no random _residual_ statement OR should i use random _residual_ statement and no nobound option. IF i am not using nobound option, then how to justify the effect of Estimated G matrix is not positive definite.

These above codes were with default Optimization Technique i.e. Dual Quasi-Newton.

Now part two of the problem is changing Optimization Technique from Dual Quasi-Newton to Newton-Raphson with Ridging with nloptions tech= nrridge.

when i run same quadratic code (without random _residual_ statement) along with nobound option as :

proc glimmix data=data2 ORDER=Data nobound;

class id sex age xtime;

model vas = time time*time

sex age sex*age

sex*time age*time sex*age*time

sex*time*time age*time*time sex*age*time*time / ddfm=kr s cl ;

random int time time*time/ subject=id type=un v;

nloptions tech= nrridge; *Newton-Raphson with Ridging technique*;

run;

there is no difference in results for from previous Optimization Technique i.e Dual Quasi-Newton. The Fit criteria, Type III test of fixed effect are exactly same, so as the :

NOTE: Convergence criterion (GCONV=1E-8) satisfied.

NOTE: Estimated G matrix is not positive definite.

NOTE: PROCEDURE GLIMMIX used (Total process time):

real time 3.75 seconds

cpu time 3.71 seconds

This Newton-Raphson with Ridging technique also did not work with nobound option as it happened with Dual Quasi-Newton after i added random _residual_ statement.

BUT The problem occurs when i use nloptions tech= nrridge (i.e Newton-Raphson with Ridging technique) along with random _ residual_ statement but without nobound option.

proc glimmix data=data ORDER=Data ;

class id sex age xtime;

model vas = time time*time

sex age sex*age

sex*time age*time sex*age*time

sex*time*time age*time*time sex*age*time*time / ddfm=kr s cl ;

random int time time*time/ subject=id type=un v;

random _residual_ / sub=id type=sp(pow)(xtime);

nloptions tech= nrridge;

run;

now log is:

NOTE: Convergence criterion (XCONV=0) satisfied.

NOTE: At least one element of the gradient is greater than 1e-3.

NOTE: PROCEDURE GLIMMIX used (Total process time):

real time 10.43 seconds

cpu time 10.34 seconds

IN CONTRAST to log from same statement but with Dual Quasi-Newton technique, the above log (with Newton-Raphson with Ridging technique) shows No NOTE of Estimated G matrix is not positive definite and Convergence criterion changes from (GCONV=1E-8) to (XCONV=0). Another important differences are in Fit Statistics and Type III tests of fixed effects

Fit Statistics (from Dual Quasi-Newton technique)

-2 Res Log Likelihood 24037.76

AIC (smaller is better) 24053.76

AICC (smaller is better) 24053.80

BIC (smaller is better) 24082.84

CAIC (smaller is better) 24090.84

HQIC (smaller is better) 24065.42

Generalized Chi-Square 424340.5

Gener. Chi-Square / DF 126.74

Fit Statistics ( Newton-Raphson with Ridging technique)

-2 Res Log Likelihood 28283.79

AIC (smaller is better) 28299.79

AICC (smaller is better) 28299.83

BIC (smaller is better) 28328.87

CAIC (smaller is better) 28336.87

HQIC (smaller is better) 28311.45

Generalized Chi-Square 939474.5

Gener. Chi-Square / DF 280.61

Type III Tests of Fixed Effects (from Dual Quasi-Newton technique)

Num Den

Effect DF DF F Value Pr > F

TIME 1 1048 90.45 <.0001

TIME*TIME 1 2540 0.84 0.3598

SEX 1 131.9 4.35 0.0389

AGE 1 131.9 1.97 0.1626

SEX*AGE 1 131.9 5.22 0.0240

TIME*SEX 1 1048 0.03 0.8587

TIME*AGE 1 1048 0.00 0.9442

TIME*SEX*AGE 1 1048 0.09 0.7703

TIME*TIME*SEX 1 2540 0.10 0.7469

TIME*TIME*AGE 1 2540 0.01 0.9334

TIME*TIME*SEX*AGE 1 2540 0.00 0.9970

Type III Tests of Fixed Effects ( Newton-Raphson with Ridging technique)

Num Den

Effect DF DF F Value Pr > F

TIME 1 537 13.28 0.0003

TIME*TIME 1 3348 0.76 0.3819

SEX 1 135.9 0.10 0.7577

AGE 1 135.9 0.05 0.8315

SEX*AGE 1 135.9 0.12 0.7270

TIME*SEX 1 537 0.14 0.7074

TIME*AGE 1 537 0.03 0.8522

TIME*SEX*AGE 1 537 0.03 0.8736

TIME*TIME*SEX 1 3348 0.00 0.9580

TIME*TIME*AGE 1 3348 0.00 0.9760

TIME*TIME*SEX*AGE 1 3348 0.00 0.9762

WHY is such as dramatic difference between these two technique when incorporate covariance structure (i.e. random _residual_ statement) with spatial power structure. (I am not sure same effect will be observed if we use ar(1) or any other covariance strucute, BUT definitely it happened with sp(pow)(xtime) covariance structure) AND what is this criterion difference from (GCONV=1E-8) to (XCONV=0).

According to LITTLE 2006, Newton-Raphson with Ridging technique makes PROC GLIMMIX almost identical to PROC MIXED.

I really feel that i am asking very exhausting questions. Thanks for your patience and all the help.

Regards

Sandhu

SteveDenham · Posted 12-31-2012 08:21 AM

Without extensive investigation I can't guarantee that this is what is occurring, but I think the NRRIDG technique is finding a saddlepoint in the response surface. To investigate this, you might save the covariance parameter estimates from the QUANEW and use them as starting values.

But what is very intriguing is the large change in the denominator degrees of freedom. This is undoubtedly due to the Kenward-Rogers adjustment, and QUANEW resulting in at least one of the G-side covariance parameters being set to zero. Which parameter has a zero estimate? Can it be removed, and have the model still make sense? If it is the intercept term, consider that your R-side model may be exhausting all of the variation, leaving none to be estimated by the intercept. If this is the case, I think you can safely remove "int" from the first random statement. If not, could you please post all of the variance parameters from both techniques?

Steve Denham

Sandhu · Posted 01-03-2013 08:39 AM

Dear Steve Denham,

I can not figure it out that what is causing such a large change with use of these two techniques. I think you are the best person to understand it. Therefore, i am sending code, model information, dimensions, Optimization Information, LOG etc. Interestingly, When i remove intercept, the NRRIDG technique does not work and following error is reported ERROR: "NRRIDG Optimization cannot be completed. Optimization routine cannot improve the function value" whereas in same condition i.e. without intercept, QUANEW technique works and it converges.

A) CODE (NRRIDG with random intercept)

proc glimmix data=data2 ORDER=Data ;

class id sex age xtime;

model vas = time time*time

sex age sex*age

sex*time age*time sex*age*time

sex*time*time age*time*time sex*age*time*time / ddfm=kr s cl ;

random int time time*time/ subject=id type=un v;

random _residual_ / sub=id type=sp(pow)(xtime);

nloptions tech= NRRIDG;

run;

Model Information

Data Set WORK.DATA2

Response Variable VAS

Response Distribution Gaussian

Link Function Identity

Variance Function Default

Variance Matrix Blocked By ID

Estimation Technique Restricted Maximum Likelihood

Degrees of Freedom Method Kenward-Roger

Fixed Effects SE Adjustment Kenward-Roger

Dimensions

G-side Cov. Parameters 6

R-side Cov. Parameters 2

Columns in X 27

Columns in Z per Subject 3

Subjects (Blocks in V) 280

Max Obs per Subject 12

Optimization Information

Optimization Technique Newton-Raphson with Ridging

Parameters in Optimization 7

Lower Boundaries 4

Upper Boundaries 1

Fixed Effects Profiled

Residual Variance Profiled

Starting From Data

Iteration History

Objective Max

Iteration Restarts Evaluations Function Change Gradient

0 0 4 30662.837656 . 8506.067

1 0 2 29840.553145 822.28451046 4711.142

2 0 2 29369.643328 470.90981779 3582.741

3 0 2 28916.249747 453.39358027 3044.089

4 0 3 28783.541782 132.70796514 2944.675

5 0 2 28280.91676 502.62502243 8416.694

6 1 5 28751.288165 -470.3714047 7813.525

Convergence criterion (XCONV=0) satisfied.

Covariance Parameter Estimates

Standard

Cov Parm Subject Estimate Error

UN(1,1) ID 52.0689 .

UN(2,1) ID 1.0951 .

UN(2,2) ID 25.7652 .

UN(3,1) ID -2.3731 .

UN(3,2) ID -10.2573 .

UN(3,3) ID 12.3989 .

SP(POW) ID 0.5592 0.03979

Residual 280.30 32.7243

LOG

NOTE: Convergence criterion (XCONV=0) satisfied.

NOTE: At least one element of the gradient is greater than 1e-3.

NOTE: PROCEDURE GLIMMIX used (Total process time):

real time 10.49 seconds

cpu time 10.43 seconds

B) CODE (NRRIDG without random intercept)

proc glimmix data=data2 ORDER=Data ;

class id sex age xtime;

model vas = time time*time

sex age sex*age

sex*time age*time sex*age*time

sex*time*time age*time*time sex*age*time*time / ddfm=kr s cl ;

random time time*time/ subject=id type=un v;

random _residual_ / sub=id type=sp(pow)(xtime);

nloptions tech= NRRIDG;

run;

Model information remains same

Dimensions

G-side Cov. Parameters 3

R-side Cov. Parameters 2

Columns in X 27

Columns in Z per Subject 2

Subjects (Blocks in V) 280

Max Obs per Subject 12

Optimization Information

Optimization Technique Newton-Raphson with Ridging

Parameters in Optimization 4

Lower Boundaries 3

Upper Boundaries 1

Fixed Effects Profiled

Residual Variance Profiled

Starting From Data

Iteration History

Objective Max

Iteration Restarts Evaluations Function Change Gradient

0 0 4 29327.32795 . 11583.9

1 0 2 33917.845247 -4590.517296 11739.39

Optimization routine cannot improve the function value.

Covariance Parameter Estimates

Standard

Cov Parm Subject Estimate Error

UN(1,1) ID 17.6726 .

UN(2,1) ID 1.4510 .

UN(2,2) ID 33.5211 .

SP(POW) ID 0.9000 .

Residual 1415.97 .

LOG

ERROR: NRRIDG Optimization cannot be completed.

NOTE: Optimization routine cannot improve the function value.

NOTE: PROCEDURE GLIMMIX used (Total process time):

real time 1.43 seconds

cpu time 1.37 seconds

C) CODE (Dual Quasi-Newton with random intercept)

proc glimmix data=data2 ORDER=Data ;

class id sex age xtime;

model vas = time time*time

sex age sex*age

sex*time age*time sex*age*time

sex*time*time age*time*time sex*age*time*time / ddfm=kr s cl ;

random int time time*time/ subject=id type=un v;

random _residual_ / sub=id type=sp(pow)(xtime);

run;

Model information remains same

Dimensions

G-side Cov. Parameters 6

R-side Cov. Parameters 2

Columns in X 27

Columns in Z per Subject 3

Subjects (Blocks in V) 280

Max Obs per Subject 12

Optimization Information

Optimization Technique Dual Quasi-Newton

Parameters in Optimization 7

Lower Boundaries 4

Upper Boundaries 1

Fixed Effects Profiled

Residual Variance Profiled

Starting From Data

Iteration History

Objective Max

Iteration Restarts Evaluations Function Change Gradient

0 0 4 30662.837656 . 8506.067

1 0 5 28668.856677 1993.9809793 2199.816

2 0 58 26208.493502 2460.3631744 263595.8

3 0 5 26208.485885 0.00761685 270900.4

4 0 3 26188.347277 20.13860824 308956.5

5 0 4 26158.678073 29.66920398 616998

6 0 5 26140.803057 17.87501606 654371.3

7 0 5 26124.120621 16.68243649 688251

8 0 5 26107.310639 16.80998203 722089.4

9 0 5 26090.520122 16.79051661 755560.3

10 0 5 26073.888382 16.63173957 788085.5

11 0 5 26057.541672 16.34671061 818876.5

12 0 5 26041.594314 15.94735789 846943.3

13 0 5 26026.150001 15.44431333 871095.6

14 0 5 26011.302731 14.84726934 889940.6

15 0 2 25964.073338 47.22939308 11515841

16 1 20 25867.099209 96.97412875 635971.6

17 1 3 25828.607906 38.49130370 51872.11

18 1 11 25635.995456 192.61245002 6136433

19 1 7 25632.646006 3.34945002 13163228

20 1 6 25628.79396 3.85204595 13931469

21 1 3 25605.60941 23.18454998 10127958

22 1 3 25568.842337 36.76707290 4627900

.

80 2 5 23953.312146 0.00000207 63380.05

81 2 4 23953.311488 0.00065810 59514.87

82 2 3 23953.311486 0.00000236 53165.53

Convergence criterion (GCONV=1E-8) satisfied.

Estimated G matrix is not positive definite.

Covariance Parameter Estimates

Standard

Cov Parm Subject Estimate Error

UN(1,1) ID 9.86E-22 .

UN(2,1) ID -3.9026 0.8652

UN(2,2) ID 2.4122 0.2092

UN(3,1) ID 0.2268 0.06188

UN(3,2) ID -0.1050 0.008000

UN(3,3) ID 7.877E-8 .

SP(POW) ID 0.09958 0.01372

Residual 140.27 4.8014

LOG

NOTE: Convergence criterion (GCONV=1E-8) satisfied.

NOTE: At least one element of the gradient is greater than 1e-3.

NOTE: Estimated G matrix is not positive definite.

NOTE: PROCEDURE GLIMMIX used (Total process time):

real time 25.57 seconds

cpu time 25.30 seconds

D) CODE (Dual Quasi-Newton without random intercept)

proc glimmix data=data2 ORDER=Data ;

class id sex age xtime;

model vas = time time*time

sex age sex*age

sex*time age*time sex*age*time

sex*time*time age*time*time sex*age*time*time / ddfm=kr s cl ;

random time time*time/ subject=id type=un v;

random _residual_ / sub=id type=sp(pow)(xtime);

run;

Model information remains same

Dimensions

G-side Cov. Parameters 3

R-side Cov. Parameters 2

Columns in X 27

Columns in Z per Subject 2

Subjects (Blocks in V) 280

Max Obs per Subject 12

Optimization Information

Optimization Technique Dual Quasi-Newton

Parameters in Optimization 4

Lower Boundaries 3

Upper Boundaries 1

Fixed Effects Profiled

Residual Variance Profiled

Starting From Data

Iteration History

Objective Max

Iteration Restarts Evaluations Function Change Gradient

0 0 4 29327.32795 . 11583.9

1 0 5 29281.411327 45.91662314 12868.42

2 0 5 29240.295101 41.11622587 14297.88

3 0 70 26042.998139 3197.2969629 5.6897E9

4 0 4 25979.595376 63.40276297 9.877E8

5 0 3 25957.25277 22.34260529 1.139E10

6 0 3 25939.317553 17.93521689 1.29E10

7 0 18 25841.391645 97.92590789 7.1394E8

8 0 11 25819.989261 21.40238439 9.1343E8

9 0 10 25767.475697 52.51356388 1.5118E9

10 0 6 25765.64274 1.83295751 2.6814E9

11 0 4 25735.316611 30.32612824 3.7442E9

12 0 2 25707.023322 28.29328991 7.3018E9

13 0 2 25665.354323 41.66899876 1.3357E9

14 0 5 25649.042009 16.31231406 5.624E9

15 0 4 25599.39337 49.64863832 1.5771E9

16 0 3 25591.215318 8.17805200 4.4921E9

17 0 4 25575.47631 15.73900814 1.106E10

18 0 2 25550.179798 25.29651182 3.693E9

19 0 5 25522.241508 27.93829080 6.2534E9

20 0 3 25496.791043 25.45046438 6.5937E9

21 0 2 25471.721861 25.06918217 2.1984E9

22 0 9 25439.671707 32.05015405 2.8937E9

.

130 0 2 24125.065516 22.98137596 1.1381E8

131 0 3 24112.207432 12.85808339 1.2729E8

132 0 5 24107.301953 4.90547927 55729947

133 0 3 24106.054818 1.24713534 50910488

134 0 3 24105.477952 0.57686577 9767172

135 0 3 24105.407721 0.07023051 993324.6

136 0 3 24105.40601 0.00171169 106808.8

137 0 3 24105.405963 0.00004618 89331.43

Convergence criterion (GCONV=1E-8) satisfied.

Estimated G matrix is not positive definite.

LOG

NOTE: Convergence criterion (GCONV=1E-8) satisfied.

NOTE: At least one element of the gradient is greater than 1e-3.

NOTE: Estimated G matrix is not positive definite.

NOTE: PROCEDURE GLIMMIX used (Total process time):

real time 29.81 seconds

cpu time 29.12 seconds

I hope my presentation of above facts is clear. Please let me know if i can provide any additional information.

Thanks.

Regards,

Sandhu

SteveDenham · Posted 01-03-2013 10:32 AM

There is a reason that the default technique for optimization in GLIMMIX is the dual quasi-Newton method. It is a superior technique, and is finding an optimum solution, while the ridging method is not. You might be able to force the ridging technique by increasing the ridge parameter.

What is interesting to me is that two of the variances are essentially zero (UN(1,1) and UN(3,3)). This implies that the random effects are overspecified

Please try:

proc glimmix data=data2 ORDER=Data ;

class id sex age xtime;

model vas = time time*time

sex age sex*age

sex*time age*time sex*age*time

sex*time*time age*time*time sex*age*time*time / ddfm=kr s cl ;

random int time time*time/ subject=id type=chol v; /* Cholesky parameterization of the unstructured matrix */

random _residual_ / sub=id type=sp(pow)(xtime);

run;

If this still results in zeroes for the id and quadratic term variances, then I think some more thought needs to be given to whether the random _residual_ should be included or not. I think there is an internal contradiction between the specification of time as a continuous covariate with a random effect and a power law diminution of variation using the spatial power structure. In other words, we are overthinking the whole process. Perhaps a simpler approach will tell us all that we need.

Steve Denham

Muller · Posted 01-03-2013 03:23 PM

Dears Steve and Sandhu,

While I do not have access to your data to make more concrete suggestions, I am not clear why you need to structure the R-side covariance matrix to the extent of what you have "type=sp(pow)(xtime)". Is the outcome variable, pain assessment using 100 mm VAS scale, measured with a lot of error that could not be accounted for using a diagonal covariance matrix as in "random _residual_/ sub=id type=VC;"? I would go with a well structured G-side covariance matrix with the default R-side (you still have a non-diagonal V matrix that accounts for both with-in and between group covariation, BTW) for both model parsimony and computational simplicity purposes. Thanks.

Mulugeta

SteveDenham · Posted 01-03-2013 03:35 PM

Thanks for pulling us back from the brink. This is where the last two lines that I wrote was leading, but this is much more to the point, and WAAAAAY more likely to give interpretable results.

Steve Denham

Sandhu · Posted 01-03-2013 11:49 PM

Dear Steve and Mulugeta,

I tried both the alternatives as recommended by both of you. By using Cholesky parameterization of the unstructured matrix, the only the quadratic term variances is zero, and not the id. But the use of random _residual_/ sub=id type=VC did not improve the model and it still resulted in zeroes for the id and quadratic term variances.

A) using Steve's code with Cholesky parameterization of the unstructured matrix

Covariance Parameter Estimates

Standard

Cov Parm Subject Estimate Error

CHOL(1,1) ID 3.1811 0.4606

CHOL(2,1) ID -0.1228 0.1851

CHOL(2,2) ID 0.000026 0.1909

CHOL(3,1) ID -0.00994 0.01326

CHOL(3,2) ID -2.09E-6 0.01529

CHOL(3,3) ID 4.38E-18 .

SP(POW) ID -0.1072 0.01474

Residual 115.98 3.1069

Fit Statistics

-2 Res Log Likelihood 24954.65

AIC (smaller is better) 24968.65

AICC (smaller is better) 24968.68

BIC (smaller is better) 24994.09

CAIC (smaller is better) 25001.09

HQIC (smaller is better) 24978.85

Generalized Chi-Square 388284.9

Gener. Chi-Square / DF 115.98

it converged in 49 Iteration

Log:

NOTE: Convergence criterion (GCONV=1E-8) satisfied.

NOTE: At least one element of the gradient is greater than 1e-3.

NOTE: Estimated G matrix is not positive definite.

NOTE: PROCEDURE GLIMMIX used (Total process time):

real time 25.71 seconds

cpu time 25.36 seconds

B) using Mulugeta's code with diagonal covariance matrix as in "random _residual_/ sub=id type=VC and unstructured matrix UN variance for random effect

Covariance Parameter Estimates

Standard

Cov Parm Subject Estimate Error

UN(1,1) ID 3.23E-18 .

UN(2,1) ID 0.1491 0.9892

UN(2,2) ID 1.5762 0.1915

UN(3,1) ID -0.04960 0.07431

UN(3,2) ID -0.07712 0.007403

UN(3,3) ID 0.000025 .

Residual (VC) 149.52 3.9873

Fit Statistics

-2 Res Log Likelihood 25206.89

AIC (smaller is better) 25218.89

AICC (smaller is better) 25218.92

BIC (smaller is better) 25240.70

CAIC (smaller is better) 25246.70

HQIC (smaller is better) 25227.64

it converged in 53 iterations

LOG

NOTE: Convergence criterion (GCONV=1E-8) satisfied.

NOTE: At least one element of the gradient is greater than 1e-3.

NOTE: Estimated G matrix is not positive definite.

NOTE: PROCEDURE GLIMMIX used (Total process time):

real time 17.31 seconds

cpu time 17.05 seconds

B) using Mulugeta's code with diagonal covariance matrix as in "random _residual_/ sub=id type=VC and Cholesky parameterization of the unstructured matrix variance for random effect

Covariance Parameter Estimates

Standard

Cov Parm Subject Estimate Error

CHOL(1,1) ID 0.6879 0.8774

CHOL(2,1) ID 0.1156 0.2015

CHOL(2,2) ID 4.1E-19 .

CHOL(3,1) ID -0.01065 0.01730

CHOL(3,2) ID 0.000451 0.003211

CHOL(3,3) ID 2.13E-16 .

Residual (VC) 139.60 3.6281

Fit Statistics

-2 Res Log Likelihood 26165.98

AIC (smaller is better) 26175.98

AICC (smaller is better) 26176.00

BIC (smaller is better) 26194.15

CAIC (smaller is better) 26199.15

HQIC (smaller is better) 26183.27

Generalized Chi-Square 467391.5

Gener. Chi-Square / DF 139.60

it converged in 41 iterations

Log

NOTE: Convergence criterion (GCONV=1E-8) satisfied.

NOTE: At least one element of the gradient is greater than 1e-3.

NOTE: Estimated G matrix is not positive definite.

NOTE: PROCEDURE GLIMMIX used (Total process time):

real time 3.25 seconds

cpu time 3.19 seconds

Thanks

Regards

Sandhu

SteveDenham · Posted 01-04-2013 07:38 AM

It is no great crime if a variance component is zero (G matrix is not positive definite). It simply means that for the data available, other terms explain all of the variation. So long as the proper adjustment to degrees of freedom is applied, tests and confidence intervals should be acceptable.

Given all that has gone on here, it appears that the Cholesky parameterization leads to the model with the smallest information criteria, so I would just go with that. It implies that after the individual variability, linear in time variability, their covariance, and their covariance with quadratic in time are accounted for, you have no remaining variability due strictly to the quadratic in time component. So long as you feel that the covariances are important, then this is where you end up. With a larger or different data set, you might get something estimated for the quadratic component.

If you believe, a priori, that the covariances are not important, then you should steer away from the unstructured covariance types, and use the diagonal structure that Muller proposed.

Steve Denham

Sandhu · Posted 01-04-2013 07:52 AM

Thanks a lot, Steve. I will do that.

Regrads

Sandhu

Catch up on SAS Innovate 2026