Hi there,
I have repeated-measures data with a multinomial outcome that I am trying to model using PROC GLIMMIX. The outcome is actually ordinal (0 = none, 1 = low, 2 = medium, 3 = high), but it does not appear to meet the proportional odds assumption for a cumulative logit model, so I switched to a generalized logit model instead. The problem is that the switch has made the model too large to run. I get this error:
ERROR: Model is too large to be fit by PROC GLIMMIX in a reasonable amount of time on this system. Consider changing your model.
Here is the syntax:
proc glimmix data = cogFunc._07_Model;
class ID outcomeClass (ref = 'None/minimal') exposureC (ref = '1 = Low');
model outcomeClass = exposureC / distribution = multinomial link = gLogit solution cl intercept;
random intercept / subject = ID group = outcomeClass;
nLOptions maxIter = 5000;
run;
As you can see, it's already a simple model, so I don't think there's any way for me to change the model. GLIMMIX will not model R-side effects for multinomial models, hence the G-side intercepts. GENMOD does not do generalized logit models, only cumulative logit models. I did try using numeric variables instead of character variables; I saw that suggested in another thread, but it did not help.
proc glimmix data = cogFunc._07_Model;
class ID outcomeClassN (ref = '0') exposureN (ref = '1');
model outcomeClassN = exposureN / distribution = multinomial link = gLogit solution cl intercept;
random intercept / subject = ID group = outcomeClassN;
nLOptions maxIter = 5000;
run;
I have also tried different methods (laplace, quad), but the procedure seems to stop way before any actual estimating occurs. It seems to get to the dimensions and just throw its hands up! I have all my SAS memory options maxed, and I have 24 GB of RAM.
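For reference, here is a minimal sketch of what the Laplace attempt looks like, assuming the only change from the first model above is the METHOD= option on the PROC GLIMMIX statement (METHOD=QUAD is specified the same way):

```sas
/* Sketch: same model as above, but requesting Laplace approximation */
/* instead of the default residual pseudo-likelihood (RSPL).         */
proc glimmix data = cogFunc._07_Model method = laplace;
class ID outcomeClass (ref = 'None/minimal') exposureC (ref = '1 = Low');
model outcomeClass = exposureC / distribution = multinomial link = gLogit solution cl intercept;
/* One random intercept per non-reference outcome category, per subject */
random intercept / subject = ID group = outcomeClass;
nLOptions maxIter = 5000;
run;
```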
I think the problem is that I have over 16k subjects. Here's the partial output I receive.
The GLIMMIX Procedure
Model Information
  Data Set                     COGFUNC._07_MODEL
  Response Variable            outcomeClass
  Response Distribution        Multinomial (nominal)
  Link Function                Generalized Logit
  Variance Function            Default
  Variance Matrix Blocked By   ID
  Estimation Technique         Residual PL
  Degrees of Freedom Method    Containment
Class Level Information
  Class          Levels   Values
  ID              16380   not printed
  outcomeClass        4   High Low Medium None/minimal
  exposureC           4   2 = Intermediate 3 = High 4 = Very high 1 = Low
  Number of Observations Read   58575
  Number of Observations Used   58575
Response Profile
  Ordered Value   outcomeClass   Total Frequency
  1               High                      4103
  2               Low                      23846
  3               Medium                    7283
  4               None/minimal             23343
In modeling category probabilities, outcomeClass='None/minimal' serves as the reference category.
Dimensions
  G-side Cov. Parameters         4
  Columns in X                  15
  Columns in Z per Subject       4
  Subjects (Blocks in V)     16380
  Max Obs per Subject            9
In another thread about this error, the suggestion was to use PROC GEE instead of GLIMMIX so that the subject intercepts would not have to be estimated. I am not familiar with PROC GEE. Does this code look reasonable? The repeated measures are not evenly spaced in time, the spacing differs from subject to subject, and the number of measurements per subject varies: e.g. subject 1 could have 3 measurements at days 47, 496, and 10345, whereas subject 2 could have 4 measurements at days 365, 849, 3495, and 9231.
proc gee data = cogFunc._07_Model descending;
class ID outcomeClass exposureC (ref = '1 = Low');
model outcomeClass = exposureC / dist = multinomial link = gLogit type3;
repeated subject = ID;
run;
For this model, PROC GEE will not accept anything other than type = independent on the REPEATED statement, so I guess I'm stuck with that. It also tells me that it cannot calculate Type III statistics, so I don't get an overall test of the exposure factor. Are there any options that I should be using to get more information out of PROC GEE? Is this even the correct procedure to use in my case?
Thanks in advance for any help.
Warm regards,
Michael