Proc glimmix random intercepts and slopes with correlated residuals


Posted 09-29-2018 02:21 PM
(2358 views)

I am modeling the probability of a child being retained in kindergarten using the ECLS-K 2011 dataset. The model includes random intercepts and slopes. I'd like to allow the cluster-level residuals to be correlated. Here is my current code:

```
proc glimmix;
  class s2_ID;
  model Retained (event=last) = X2RTHETK1_cwc X2MTHETK1_cwc S2NMRETK_gmc x1ageent_cwc
        / cl dist=binary link=logit solution;
  random intercept X2RTHETK1_cwc X2MTHETK1_cwc / subject=s2_id;
run;
```

X2RTHETK1_cwc and X2MTHETK1_cwc are reading and math achievement in kindergarten (centered within clusters), S2NMRETK_gmc is the number of students retained the prior year (grand-mean centered), and x1ageent_cwc is the age at kindergarten entry (centered within clusters).

I'm not sure which options I need to include, and the ones I have tried have resulted in errors. I have tried changing the covariance structure to type=un, but I get the error "Estimated G matrix is not positive definite." I've also tried adding a random _residual_ statement, but I cannot get the model to converge with that approach. What options make the most sense for what I'm trying to do? I'm using SAS 9.4.

3 Replies


More detail about your study design would be helpful.

I might assume that a cluster (s2_id?) is a classroom with students nested within. I might assume that X2RTHETK1, X2MTHETK1, and x1ageent are student-level variables (that students are the sampling units for these variables). The sampling unit for S2NMRETK is not clear to me. And my assumptions may be incorrect: I am not familiar with the ECLS-K 2011 dataset.

Are your clusters independent? If not, in what way are they dependent?


Thanks so much for your response. Please see my answers below:

"I might assume that a cluster (s2_id?) is a classroom with students nested within."

s2_id is the school.

"I might assume that X2RTHETK1, X2MTHETK1, and x1ageent are student-level variables (that students are the sampling units for these variables)."

Yes, these are student-level variables: reading score, math score, and age at kindergarten entry.

"The sampling unit for S2NMRETK is not clear to me. And my assumptions may be incorrect: I am not familiar with the ECLS-K 2011 dataset."

S2NMRETK is a school-level variable that indicates the number of students retained in kindergarten in the school during the prior school year.

"Are your clusters independent? If not, in what way are they dependent?"

Clusters are independent.


So, you have students nested within schools. *X2RTHETK1*, *X2MTHETK1*, and *x1ageent* are student-level variables. *S2NMRETK* is a school-level variable. I presume *Retained* is binary.

If your clusters (schools) are independent, then there is no need "to allow the cluster-level residuals to be correlated". I think what you may want is to allow nonzero covariances among the random intercept and the random slopes for *X2RTHETK1_cwc* and *X2MTHETK1_cwc*, which would be accomplished by

```
random intercept X2RTHETK1_cwc X2MTHETK1_cwc/ subject=s2_id type=un;
```
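To make the role of `type=un` concrete, here is a sketch of the model this statement implies. The shorthand is mine, not from the dataset: $R_{ij}$ and $M_{ij}$ stand for *X2RTHETK1_cwc* and *X2MTHETK1_cwc*, $A_{ij}$ for *x1ageent_cwc*, and $S_j$ for *S2NMRETK_gmc*, for student $i$ in school $j$; $\pi_{ij}$ is the probability of the event level of *Retained*.

$$
\begin{aligned}
\operatorname{logit}(\pi_{ij}) &= \beta_0 + \beta_1 R_{ij} + \beta_2 M_{ij} + \beta_3 S_j + \beta_4 A_{ij} + u_{0j} + u_{1j} R_{ij} + u_{2j} M_{ij}, \\
(u_{0j}, u_{1j}, u_{2j})^\top &\sim N(\mathbf{0}, \mathbf{G}), \\
\mathbf{G}_{\text{un}} &=
\begin{pmatrix}
\sigma_0^2 & \sigma_{01} & \sigma_{02} \\
\sigma_{01} & \sigma_1^2 & \sigma_{12} \\
\sigma_{02} & \sigma_{12} & \sigma_2^2
\end{pmatrix},
\qquad
\mathbf{G}_{\text{vc}} =
\begin{pmatrix}
\sigma_0^2 & 0 & 0 \\
0 & \sigma_1^2 & 0 \\
0 & 0 & \sigma_2^2
\end{pmatrix}.
\end{aligned}
$$

Without a `type=` option the RANDOM statement uses the variance-components structure ($\mathbf{G}_{\text{vc}}$, three parameters), so the intercept and slopes are assumed uncorrelated. With `type=un` all six (co)variance parameters are estimated, which is also why it asks more of the data.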

The "Estimated G matrix is not positive definite." message occurs when one or more variances/covariances have been set to zero. This can happen because the (co)variance is very small, because the data provide inadequate support for its estimation, or because the estimation method is less than optimal (binary response data can be problematic). You could try various adjustments to the model, such as using Laplace estimation; see the papers by Kiernan, Tao, and Gibbs (2012) and Tao, Kiernan, and Gibbs (2015) for this and other ideas.
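As a rough sketch of the Laplace suggestion, the original call might be adjusted as follows. The data set name `work.eclsk` is hypothetical (the original code relied on the most recently created data set), and whether this resolves the G matrix message depends on the data.

```
/* Sketch only: work.eclsk is a hypothetical data set name. */
proc glimmix data=work.eclsk method=laplace;
  class s2_id;
  model Retained (event=last) = X2RTHETK1_cwc X2MTHETK1_cwc S2NMRETK_gmc x1ageent_cwc
        / cl dist=binary link=logit solution;
  /* type=un estimates all variances and covariances among the   */
  /* random intercept and the two random slopes within a school. */
  random intercept X2RTHETK1_cwc X2MTHETK1_cwc / subject=s2_id type=un;
  /* Possible alternative, not mentioned in this thread: type=chol */
  /* parameterizes G through its Cholesky root instead of type=un. */
run;
```

METHOD=LAPLACE requires every RANDOM statement to have a SUBJECT= option and does not allow a random _residual_ statement, both of which are already true of this model.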

With binary response data, you will not need a random _residual_ statement.
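The reasoning, sketched with $Y_{ij}$ as the binary outcome for student $i$ in school $j$ and $\pi_{ij}$ as its conditional event probability (my notation): for a Bernoulli response the variance is fixed by the mean, so there is no free student-level residual variance to estimate.

$$
\operatorname{Var}(Y_{ij} \mid \mathbf{u}_j) = \pi_{ij}\,(1 - \pi_{ij})
$$

Adding `random _residual_;` would multiply this by an extra overdispersion parameter $\phi$, which ungrouped binary data give little basis to estimate; that may be part of why the model would not converge.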

If you have not already done so, you would probably find value in reading in detail about multilevel modeling; there is an extensive list of resources here.

I hope this helps.
