Re: GENMOD + REPEATED for fitting model with repeated measurement data

SHINAR · Posted 08-24-2020 10:00 PM

I have repeated-measures data with 1.8 million observations, and each individual has more than one record. I would like to explore the relationship between outcome and exposure using PROC GLIMMIX, like this:

proc glimmix data=mydata;
      class id stage;
      model outcome=exposure confoundings / solution cl;
      random int / subject=id(stage);
run;

However, the SAS log reported this error: "Model is too large to be fit by PROC GLIMMIX in a reasonable amount of time on this system. Consider changing your model."

I would like to know whether I could use PROC GENMOD with a REPEATED statement to fit my model instead:

proc genmod data=mydata;
    class id stage;
    model outcome=exposure confoundings;
    repeated subject=id / within=stage;
run;

SteveDenham · Posted 08-25-2020 08:23 AM

First, what sort of variable is 'outcome'? I don't see a distribution specified in either the GLIMMIX or the GENMOD code, which means the default distribution is Gaussian. If that is the case, then consider using PROC HPMIXED. The overview in the documentation makes it look like your situation is what HPMIXED was designed for (a large number of observations).

Code would resemble this (to start):

proc hpmixed data=mydata;
      class id stage;
      model outcome=exposure confoundings / solution cl;
      random int / subject=id(stage);
      test exposure confoundings;
run;

Depending on the nature of exposure and confoundings, they may need to be added to the CLASS statement.

SteveDenham

SHINAR · Posted 08-25-2020 09:43 AM

Yes, the distribution is Gaussian, and thank you for your help.

Ksharp · Posted 08-25-2020 08:32 AM

I think you are right. A GLMM estimates subject-specific (conditional) effects, while GEE estimates population-averaged (marginal) effects. With a Gaussian outcome and an identity link, the two give similar output.

@lvm @StatDave

SteveDenham · Posted 08-25-2020 08:36 AM

Plus, we have yet to recognize the repeated measures nature of the data with an appropriate REPEATED statement. The current version only deals with a G-side random intercept. The variable that indexes the repeated measures has not been mentioned yet.
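
As a rough sketch of what an R-side specification might look like in PROC HPMIXED: the code below assumes that stage is the variable indexing the repeated measurements within each id, and it picks compound symmetry (TYPE=CS) purely for illustration. Neither assumption has been confirmed in this thread, so the REPEATED effect and the covariance structure would need to be adapted to the actual data.

```sas
/* Sketch only: assumes stage indexes the repeated measures within id. */
/* TYPE=CS is an illustrative choice, not a recommendation.            */
proc hpmixed data=mydata;
      class id stage;
      model outcome=exposure confoundings / solution cl;
      repeated stage / subject=id type=cs;
run;
```

Here confoundings is the same placeholder used earlier in the thread and should be replaced by the actual covariate list.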

SteveDenham

SHINAR · Posted 08-25-2020 09:50 AM

Thank you for your help. I have read some articles about this issue, and GEE with a REPEATED statement seems more suitable for my data.

SteveDenham · Posted 08-26-2020 09:01 AM

If you run into memory issues with PROC GEE, then you have PROC HPMIXED to fall back on.

SteveDenham

StatDave · Posted 08-25-2020 10:05 AM

If you want to fit a GEE model, I suggest using the newer PROC GEE rather than PROC GENMOD. Same syntax.
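
For illustration, the GENMOD code from the original question might carry over to PROC GEE roughly as follows. This is a sketch under assumptions: DIST=NORMAL reflects the Gaussian outcome confirmed above, TYPE=EXCH (exchangeable working correlation) is only an example choice, and stage is listed in the CLASS statement because it is used as the WITHIN= effect.

```sas
/* Sketch only: TYPE=EXCH is an assumed working correlation structure. */
proc gee data=mydata;
    class id stage;
    model outcome=exposure confoundings / dist=normal link=identity;
    repeated subject=id / within=stage type=exch;
run;
```

If no TYPE= option is given, the working correlation defaults to independence, so it is worth stating the structure explicitly.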
