Statistical Procedures

Programming the statistical procedures from SAS
SHINAR
Calcite | Level 5

I have a repeated-measures data set with 1.8 million observations, in which each individual has more than one record. I would like to explore the relationship between outcome and exposure using PROC GLIMMIX, like this:

 

proc glimmix data=mydata;
      class id stage;
      model outcome=exposure confoundings / solution cl;
      random int / subject=id(stage);
run;

 

 

But the SAS log reported this error: "Model is too large to be fit by PROC GLIMMIX in a reasonable amount of time on this system. Consider changing your model."

I would like to know whether I could use PROC GENMOD with a REPEATED statement instead to fit my model.

 

proc genmod data=mydata;
    class id stage;   /* the WITHIN= variable must be listed in CLASS */
    model outcome=exposure confoundings;
    repeated subject=id / within=stage;
run;

 

7 REPLIES
SteveDenham
Jade | Level 19

First, what sort of variable is 'outcome'? I don't see a distribution specified in either the GLIMMIX or the GENMOD code, which means both fall back on the default, Gaussian. If that is the case, then consider using PROC HPMIXED. The overview in the documentation makes it look like your situation is what HPMIXED was designed for (large number of observations).

 

Code would resemble this (to start)

 

proc hpmixed data=mydata;
      class id stage;
      model outcome=exposure confoundings / solution cl;
      random int / subject=id(stage);
      test exposure confoundings;
run;

Depending on the nature of exposure and confoundings, they may need to be added to the CLASS statement.

 

SteveDenham

 

SHINAR
Calcite | Level 5

Yes, the distribution is Gaussian, and thank you for your help.

Ksharp
Super User
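I think you are right. A GLMM estimates subject-specific (conditional) effects, while GEE estimates population-averaged (marginal) effects. With a Gaussian outcome and an identity link, the fixed-effect estimates have the same interpretation, so the two give similar output.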

@lvm @StatDave
SteveDenham
Jade | Level 19

Plus, we have yet to recognize the repeated-measures nature of the data with an appropriate REPEATED statement. The current version only deals with a G-side random intercept. The variable that indexes the repeated measures has not been mentioned yet.

 

SteveDenham

SHINAR
Calcite | Level 5

Thank you for your help. I have read some articles about this issue, and GEE with a REPEATED statement seems more suitable for my data.

SteveDenham
Jade | Level 19

If you run into memory issues with PROC GEE, then you have PROC HPMIXED to fall back on.

 

SteveDenham

StatDave
SAS Super FREQ

If you want to fit a GEE model, I suggest using the newer PROC GEE rather than PROC GENMOD. Same syntax.
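
A minimal sketch of what that could look like, reusing the variable names from the question. The explicit DIST=NORMAL reflects the Gaussian outcome confirmed above. The exchangeable working correlation (TYPE=EXCH) and the use of stage as the within-subject index are assumptions for illustration, not something established in this thread, so adjust them to your design.

```sas
/* Sketch only: GEE fit with PROC GEE instead of PROC GENMOD.          */
/* Assumes stage indexes the repeated measurements within each id.     */
proc gee data=mydata;
    class id stage;                          /* WITHIN= variable must be in CLASS */
    model outcome = exposure confoundings / dist=normal;
    repeated subject=id / within=stage
                          type=exch;         /* assumed working correlation */
run;
```

The REPEATED statement is written the same way as in PROC GENMOD, so the original GENMOD code should carry over with only the procedure name changed.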

