Kastchei
Pyrite | Level 9

Hi there,

 

I have repeated-measures data with a multinomial outcome that I was trying to model using PROC GLIMMIX. The outcome is actually ordinal (0 = none, 1 = low, 2 = medium, 3 = high), but it does not seem to meet the proportional odds assumption for a cumulative logit model, so I switched to a generalized logit model instead. The problem is that the switch has made the model too large to run. I get this error:

 

ERROR: Model is too large to be fit by PROC GLIMMIX in a reasonable amount of time on this system. Consider changing your model.

 

Here is the syntax:

 

proc glimmix data = cogFunc._07_Model;
    class ID outcomeClass (ref = 'None/minimal') exposureC (ref = '1 = Low');
    model outcomeClass = exposureC / distribution = multinomial link = gLogit solution cl intercept;
    random intercept / subject = ID group = outcomeClass;
    nLOptions maxIter = 5000;
run;

As you can see, it's already a simple model, so I don't think there's any way for me to change the model.  GLIMMIX will not model R-side effects for multinomial models, hence the G-side intercepts.  GENMOD does not do generalized logit models, only cumulative logit models.  I did try using numeric variables instead of character variables; I saw that suggested in another thread, but it did not help.

 

proc glimmix data = cogFunc._07_Model;
    class ID outcomeClassN (ref = '0') exposureN (ref = '1');
    model outcomeClassN = exposureN / distribution = multinomial link = gLogit solution cl intercept;
    random intercept / subject = ID group = outcomeClassN;
    nLOptions maxIter = 5000;
run;

I have also tried different methods (laplace, quad), but the procedure seems to stop way before any actual estimating occurs.  It seems to get to the dimensions and just throw its hands up!  I have all my SAS memory options maxed, and I have 24 GB of RAM.

 

I think the problem is that I have over 16k subjects.  Here's the partial output I receive.

 

The GLIMMIX Procedure

Model Information
Data Set COGFUNC._07_MODEL
Response Variable outcomeClass
Response Distribution Multinomial (nominal)
Link Function Generalized Logit
Variance Function Default
Variance Matrix Blocked By ID
Estimation Technique Residual PL
Degrees of Freedom Method Containment


Class Level Information
Class Levels Values
ID 16380 not printed
outcomeClass 4 High Low Medium None/minimal
exposureC 4 2 = Intermediate 3 = High 4 = Very high 1 = Low


Number of Observations Read 58575
Number of Observations Used 58575


Response Profile
Ordered Value   outcomeClass    Total Frequency
1               High             4103
2               Low             23846
3               Medium           7283
4               None/minimal    23343
In modeling category probabilities, outcomeClass='None/minimal' serves as the reference category.


Dimensions
G-side Cov. Parameters 4
Columns in X 15
Columns in Z per Subject 4
Subjects (Blocks in V) 16380
Max Obs per Subject 9

 

In another thread about this error, the suggestion was to use PROC GEE instead of GLIMMIX so that the subject intercepts would not have to be estimated. I am not familiar with PROC GEE. Does this code look reasonable? The repeated measures are not taken at constant time intervals, and neither the intervals nor the number of measurements are the same across subjects: e.g. subject 1 could have 3 measurements at days 47, 496, and 10345, whereas subject 2 could have 4 measurements at days 365, 849, 3495, and 9231.

 

 

proc gee data = cogFunc._07_Model descending;
    class ID outcomeClass exposureC (ref = '1 = Low');
    model outcomeClass = exposureC / dist = multinomial link = gLogit type3;
    repeated subject = ID;
run;

PROC GEE cannot fit anything other than type = independent in the REPEATED statement for this model, so I guess I'm stuck with that. It also tells me that it cannot calculate Type III statistics, so I don't get an overall test of the exposure factor. Are there any options that I should be using to get more information out of PROC GEE? Is this even the correct procedure to use in my case?
 

Thanks in advance for any help.

 

Warm regards,

Michael

4 REPLIES
StatDave
SAS Super FREQ

Another approach you could use is a non-modeling approach via the CMH option in PROC FREQ. This should provide an overall test of your exposure variable. For example:

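proc freq data = cogFunc._07_Model;
    /* ID is the stratification variable; CMH tests exposure vs. outcome across ID strata */
    tables ID*exposureC*outcomeClass / noprint cmh;
run;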

You will want the NOPRINT option to avoid printing a table for every ID level. The results will show three tests. Since your response is ordinal, you will probably want to use one of the first two depending on the nature of your exposure variable. If it is ordinal, use the first statistic. If it is nominal, use the second. If your response is binary, the first and second will be the same.

Kastchei
Pyrite | Level 9

Thanks, Stat Dave!  I had forgotten about CMH.  I will at some point need a working model, because exposure is just one of several variables that will be adjusted for (e.g. demographics, some medical conditions, count of medications taken).  However, I certainly can use CMH when looking at each variable one at a time vs. the outcome.

MichaelL_SAS
SAS Employee

One small comment: I believe PROC GEE does support Type III tests for generalized logit models using the Wald test statistic, and it supports Type III tests using either the generalized score statistic or the Wald test statistic for ordinal response models. To request the Wald tests, you can specify the WALD option in the MODEL statement.

StatDave
SAS Super FREQ

To be clear, to get the Type III tests you will need to specify both the TYPE3 and the WALD options in the MODEL statement.
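
Putting the two replies above together, the PROC GEE step from the question would look roughly like the sketch below. It has not been run against the original data; it simply reuses the data set and variable names from the question and adds WALD next to TYPE3. Whether DESCENDING gives the reference category you want for outcomeClass is left as in the original post and is worth checking in the output.

```sas
/* Sketch only: the PROC GEE step from the question with WALD added to TYPE3. */
/* Data set and variable names are taken from the original post.              */
proc gee data = cogFunc._07_Model descending;
    class ID outcomeClass exposureC (ref = '1 = Low');
    /* TYPE3 requests the Type III analysis; WALD asks for Wald statistics,   */
    /* which the replies say are what is supported for the generalized logit. */
    model outcomeClass = exposureC / dist = multinomial link = gLogit type3 wald;
    /* No TYPE= is given, matching the question; the poster reports that only */
    /* the independent working correlation is accepted for this model.        */
    repeated subject = ID;
run;
```

With both options in place, the output should include a Type III table giving an overall Wald test for exposureC, which is the test the original post was missing.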
