Multilevel analysis with a small number of events using the PROC GLIMM...


Posted 08-20-2018 07:46 PM
(1454 views)

Hello,

I am doing a multilevel analysis using PROC GLIMMIX in SAS. The data I have are categorical, with a very small number of events. The data were collected using multi-stage cluster sampling, i.e., individuals were nested within clusters, and clusters were nested within regions. The total sample size is 6954 and the number of events is 254, which is only 3.65% of the total sample.

Since the number of events is very small, I am getting an error message when I run the model. Is there another statistical method that can accommodate this? I would really appreciate your advice and support in this regard.

The sample code I used is below:

    proc glimmix data = home.caesarean2 method = laplace;
      class region hregion;
      model M17_1 (event = last) = v012 v106 v717 v501 v701 v705 v150h v136 v190 v130 v743a v201 m14_1 m10_1 v212
            FACTYPE distance csa1 csa2 csr1 csr2 gr1 gr2 hfma v025
            HREGION / s cl dist = binary link = logit;
      random intercept v012 v106 v717 v501 v701 v705 v150h v136 v190 v130 v743a v201 m14_1 m10_1 v212
             FACTYPE distance csa1 csa2 csr1 csr2 gr1 gr2 hfma v025 / subject = region cl s type = vc;
      random intercept v012 v106 v717 v501 v701 v705 v150h v136 v190 v130 v743a v201 m14_1 m10_1 v212
             / subject = v001 (region) cl s type = vc;
      covtest / wald;
    run;

Kind regards

Teketo

4 REPLIES


When you get an error, it is very helpful to copy both the code and the error message(s) from the log and paste them into a code box opened with the forum's {I} icon.

At a minimum, all CLASS variables must appear in the MODEL statement. I don't see REGION in the MODEL statement, although it is in the CLASS statement.


As @ballardw says, it would be *very helpful* to have details about the nature of the errors.

Your model is very ambitious: it specifies a lot of parameters to estimate, and I would not be surprised if you've just run out of data to support all that estimation.

GLIMMIX allows SUBJECTs in RANDOM statements to be continuous rather than classification variables, but see https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_glimmix_sec... where it says:

**SUBJECT= effect**

**SUB= effect**

identifies the subjects in your generalized linear mixed model. Complete independence is assumed across subjects. Specifying a subject effect is equivalent to nesting all other effects in the RANDOM statement within the subject effect.

Continuous variables and computed variables are permitted with the SUBJECT= option. PROC GLIMMIX does not sort by the values of the continuous variable but considers the data to be from a new subject whenever the value of the continuous variable changes from the previous observation. Using a continuous variable can decrease execution time for models with a large number of subjects and also prevents the production of a large "Class Levels Information" table.

So if REGION and V001(REGION) are not sorted appropriately, you could be specifying too many subjects.
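A minimal sketch of guarding against that, assuming the data set and variable names from the original post (`cs.caesarean2`, `region`, `v001`); the output name `work.caesarean_sorted` is hypothetical. Sorting keeps each subject's observations contiguous, and the query gives a quick count of how many subjects there really are at each level. Listing V001 in the CLASS statement would be another way to avoid depending on sort order.

```sas
/* Sort so that all observations from one cluster are contiguous within region. */
proc sort data=cs.caesarean2 out=work.caesarean_sorted;
  by region v001;
run;

/* Count the distinct subjects at each level as a sanity check. */
proc sql;
  select count(distinct region)                  as n_regions,
         count(distinct catx('_', region, v001)) as n_clusters
  from work.caesarean_sorted;
quit;
```

If the counts match what you expect from the survey design, the SUBJECT= effects are at least being defined on sensibly grouped data.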

I think your specification of random slopes at both REGION and V001(REGION) levels could be incorrect. I would think that you would use *means* (computed over the V001 levels within each REGION) as predictor variables at the REGION level. The paper by Judith Singer does a good job of developing this idea; there are lots of other resources as well, of course.
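As a rough illustration of the "means as region-level predictors" idea, here is a sketch for a single predictor. It assumes the names from the original post (`cs.caesarean2`, `region`, `v001`, `v012`); the output data sets and the `v012_cmean` / `v012_rmean` variables are hypothetical names, and whether this is the right predictor to aggregate is a modeling decision, not something the code settles.

```sas
/* Cluster-level means of one illustrative predictor (v012). */
proc means data=cs.caesarean2 noprint nway;
  class region v001;
  var v012;
  output out=work.cluster_means mean=v012_cmean;
run;

/* Region-level mean: average the cluster means within each region. */
proc means data=work.cluster_means noprint nway;
  class region;
  var v012_cmean;
  output out=work.region_means mean=v012_rmean;
run;

/* Attach the region-level mean to every individual record. */
proc sql;
  create table work.caesarean_rm as
  select a.*, b.v012_rmean
  from cs.caesarean2 as a
       left join work.region_means as b
       on a.region = b.region;
quit;
```

The new variable could then enter the MODEL statement as a fixed effect describing the region, rather than asking for a random slope of the individual-level variable at the region level.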

In addition to considering a different model specification and understanding more about what you are attempting, I would start simply and build up--in other words, do not throw all of the predictors into the model at once. With this many continuous predictor variables, assessing the linearity assumption will be a challenge.
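One possible starting point, sketched under the assumption that the names in the original post are correct: random intercepts only at the region and cluster levels, with two fixed effects (`v012` and `v106`) chosen purely for illustration. V001 is placed in the CLASS statement here so that the subject definition does not depend on sort order.

```sas
/* Random-intercept-only model: a deliberately small first step. */
proc glimmix data=cs.caesarean2 method=laplace;
  class region v001;
  model M17_1(event=last) = v012 v106
        / solution cl dist=binary link=logit;
  random intercept / subject=region;         /* between-region variation  */
  random intercept / subject=v001(region);   /* between-cluster variation */
  covtest / wald;
run;
```

If this converges, predictors can be added a few at a time, and a random slope only where there is a substantive reason to expect the effect to vary across clusters or regions.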

I hope this helps.


Hello,

Many thanks.

I did include all the class variables in the model statement.

The SUBJECTs in the RANDOM statement are discrete, i.e. REGION ranges from 1 to 11 and V001 from 1 to 622.

SAS stops processing the procedure, and I get the following error message:

    proc glimmix data = cs.caesarean2 method = laplace;
      class region hregion;
      model M17_1 (event = last) = v012 v106 v717 v501 v701 v705 v150h v136 v190 v130 v743a v201
            m14_1 m10_1 v212
            FACTYPE distance csa1 csa2 csr1 csr2 gr1 gr2 hfma v025
            region HREGION / s cl dist = binary link = logit;
      random intercept v012 v106 v717 v501 v701 v705 v150h v136 v190 v130 v743a v201 m14_1 m10_1 v212
             FACTYPE distance csa1 csa2 csr1 csr2 gr1 gr2 hfma v025 / subject = region cl s type = vc;
      random intercept v012 v106 v717 v501 v701 v705 v150h v136 v190 v130 v743a v201 m14_1 m10_1 v212
             / subject = v001 (region) cl s type = vc;
      covtest / wald;
    run;

    NOTE: The GLIMMIX procedure is modeling the probability that M17_1='1'.
    ERROR: The SAS System stopped processing this step because of insufficient memory.
    NOTE: PROCEDURE GLIMMIX used (Total process time):
          real time 7.40 seconds
          cpu time 6.61 seconds

Kind regards

Teketo


Ah, yes, you do have REGION as a CLASS variable; my apologies. But V001 is not in the CLASS statement, so sorting is still a concern.

Still, your model is attempting to estimate 622 random effects (i.e., one slope for EACH level of V001) for EACH continuous predictor variable (of which there are 15). That is a great many parameter estimates: even if you had enough data, you don't have enough memory. I still think that you are asking far too much of your model, and that you need to ponder what statistical model mirrors your experimental design, what you want, and what is possible. Push back from the keyboard, and give it some thought.
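To put a rough number on it, here is a small DATA step that counts the random-effect solutions the posted RANDOM statements ask for. It assumes 11 regions and 622 distinct clusters (the ranges given earlier in the thread) and that none of the variables in the RANDOM statements are CLASS effects; the variable names in the step are just labels for this calculation.

```sas
data _null_;
  n_regions  = 11;    /* REGION ranges from 1 to 11 */
  n_clusters = 622;   /* V001 ranges from 1 to 622  */

  /* First RANDOM statement: intercept + 25 slopes per region.   */
  effects_per_region  = 1 + 25;
  /* Second RANDOM statement: intercept + 15 slopes per cluster. */
  effects_per_cluster = 1 + 15;

  region_solutions  = n_regions  * effects_per_region;
  cluster_solutions = n_clusters * effects_per_cluster;
  total_solutions   = region_solutions + cluster_solutions;

  put region_solutions= cluster_solutions= total_solutions=;
run;
```

Under those assumptions the log shows 286 region-level and 9952 cluster-level solutions, 10238 in total, to be supported by 254 events.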
