G_Le_Teuff
Calcite | Level 5

Dear all,

I fitted a simple Poisson regression model on a large database with a single categorical variable (levels 0, 1, 2), entered either through the class statement or as 2 dummy variables, and I was surprised to see that the intercept estimate differs. Please find below the SAS code for 3 "identical" models that give different results. The reference level is 2.

proc freq data=temp;
   tables trt_new*trt1*trt2 / list;
run;

trt_new    trt1    trt2    Frequency    Percent    Cumulative Frequency    Cumulative Percent
      0       1       0        44203      55.22                   44203                 55.22
      1       0       1        10315      12.89                   54518                 68.11
      2       0       0        25531      31.89                   80049                100.00

Model 1

proc glimmix data=temp;
   class trt_new;
   model status = trt_new / dist=poisson link=log offset=ln_y s;
run;

Effect       trt_new    Estimate    Standard Error       DF    t Value    Pr > |t|
Intercept                -1.4075           0.01379    80046    -102.10      <.0001
trt_new      0           -0.1615           0.01780    80046      -9.07      <.0001
trt_new      1           -0.1679           0.02712    80046      -6.19      <.0001
trt_new      2                 0                 .        .          .           .

Model 2

proc glimmix data=temp;
   class trt1 trt2;
   model status = trt1 trt2 / dist=poisson link=log offset=ln_y s;
run;

Effect       trt1    trt2    Estimate    Standard Error       DF    t Value    Pr > |t|
Intercept                     -1.7369           0.02937    80046     -59.14      <.0001
trt1         0                 0.1615           0.01780    80046       9.07      <.0001
trt1         1                      0                 .        .          .           .
trt2                 0         0.1679           0.02712    80046       6.19      <.0001
trt2                 1              0                 .        .          .           .

Model 3

proc glimmix data=temp;
   model status = trt1 trt2 / dist=poisson link=log offset=ln_y s;
run;

Effect       Estimate    Standard Error       DF    t Value    Pr > |t|
Intercept     -1.4075           0.01379    80046    -102.10      <.0001
trt1          -0.1615           0.01780    80046      -9.07      <.0001
trt2          -0.1679           0.02712    80046      -6.19      <.0001

Do you have an explanation for why the intercept of model 2 differs from that of models 1 and 3? I would like to extend these models by adding other categorical variables, but I do not know whether it is better to use the class statement or dummy variables.

In addition, how can I define the reference level, given that param=ref and ref=first do not exist in proc glimmix? Thanks in advance.

Gwénaël

PaigeMiller
Diamond | Level 26

If you created the dummy variables properly, the only difference is that the model is parameterized differently, so the intercept can change. The models are equivalent once the difference in parameterization is taken into account, and your predicted values should be the same.

Your three examples seem to give exactly the same estimates once this difference in parameterization is taken into account.
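One way to check the claim about predicted values is to save the fitted means from two of the parameterizations and compare them row by row. This is only a sketch: the output data set names (pred_class, pred_dummy) and variable names (mu_class, mu_dummy) are made up for illustration, and it assumes both OUTPUT data sets keep the rows in the same order as temp.

```sas
/* Model 1: treatment entered through the CLASS statement */
proc glimmix data=temp;
   class trt_new;
   model status = trt_new / dist=poisson link=log offset=ln_y s;
   output out=pred_class pred(ilink)=mu_class;   /* hypothetical names */
run;

/* Model 2: the two dummies entered as CLASS variables */
proc glimmix data=temp;
   class trt1 trt2;
   model status = trt1 trt2 / dist=poisson link=log offset=ln_y s;
   output out=pred_dummy pred(ilink)=mu_dummy;   /* hypothetical names */
run;

/* Compare the predicted means observation by observation */
proc compare base=pred_class compare=pred_dummy
             method=absolute criterion=1e-6;
   var mu_class;
   with mu_dummy;
run;
```

If the parameterizations are equivalent, PROC COMPARE should report no differences beyond the stated tolerance.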

Model 1

If Trt_new=0, then the effect is –1.4075–0.1615 = –1.569


Model 2

If Trt1=1 and trt2=0, which is the same as Trt_new=0 then the effect is –1.7369+0+0.1679 = –1.569, exact same effect
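The same check can be extended to all three treatment levels. The data step below is only a sketch that plugs in the estimates posted above; the variable names are made up for illustration, and the linear predictors exclude the offset.

```sas
/* Check the arithmetic using the estimates posted in the thread. */
data _null_;
   /* Model 1: CLASS trt_new, reference level 2 */
   m1_b0 = -1.4075;  m1_new0 = -0.1615;  m1_new1 = -0.1679;
   /* Model 2: CLASS trt1 trt2, reference level 1 for each dummy */
   m2_b0 = -1.7369;  m2_trt1_0 = 0.1615;  m2_trt2_0 = 0.1679;

   /* Linear predictor for each treatment level under Model 1 */
   m1_lev0 = m1_b0 + m1_new0;                 /* trt_new=0 */
   m1_lev1 = m1_b0 + m1_new1;                 /* trt_new=1 */
   m1_lev2 = m1_b0;                           /* trt_new=2 (reference) */

   /* Same levels under Model 2 */
   m2_lev0 = m2_b0 + m2_trt2_0;               /* trt1=1, trt2=0 */
   m2_lev1 = m2_b0 + m2_trt1_0;               /* trt1=0, trt2=1 */
   m2_lev2 = m2_b0 + m2_trt1_0 + m2_trt2_0;   /* trt1=0, trt2=0 */

   put m1_lev0= 8.4 m2_lev0= 8.4;
   put m1_lev1= 8.4 m2_lev1= 8.4;
   put m1_lev2= 8.4 m2_lev2= 8.4;
run;
```

Both models give -1.5690, -1.5754 and -1.4075 for trt_new = 0, 1 and 2. The Model 2 intercept is the value for the cell trt1=1, trt2=1, which does not occur in the data, and that is why it differs from the Model 1 intercept.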

--
Paige Miller
SteveDenham
Jade | Level 19

In addition to the models you present (which have been shown to be equivalent), you could use the REF= option in the CLASS statement to set which level is the reference group.

I prefer to let the PROC generate what I need, using the CLASS statement, as opposed to hand coding dummy variables.  Much less chance of a surprise due to improper levels (at least for me).
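A minimal sketch of what that could look like, assuming a SAS/STAT release in which the GLIMMIX CLASS statement accepts REF= (older releases may not):

```sas
/* Assumption: this release supports REF= in the GLIMMIX CLASS statement. */
proc glimmix data=temp;
   class trt_new(ref='2');   /* level 2 as reference; ref=first or ref=last also possible */
   model status = trt_new / dist=poisson link=log offset=ln_y s;
run;
```

The quoted value refers to the formatted value of the level, so it would need to match the format if one is attached to trt_new.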

Steve Denham

G_Le_Teuff
Calcite | Level 5

For model 2, I used order=formatted to obtain the same results, because REF= did not exist in proc glimmix in SAS 9.3.

I used dummy variables for the treatment variable because the first program below runs faster and converges, whereas the second program, which replaces the 2 random statements with a single one, does not converge.
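A sketch of how order=formatted can move the reference level when REF= is not available. The format name dumref and its labels are hypothetical; the idea is only that the level whose formatted value sorts last becomes the reference.

```sas
/* Hypothetical format: the label for 0 sorts after the label for 1,
   so level 0 becomes the last (reference) level of each dummy. */
proc format;
   value dumref 1 = 'A: dummy=1'
                0 = 'B: dummy=0 (ref)';
run;

proc glimmix data=temp order=formatted;
   class trt1 trt2;
   format trt1 trt2 dumref.;
   model status = trt1 trt2 / dist=poisson link=log offset=ln_y s;
run;
```

With 0 as the reference for both dummies, the intercept corresponds to trt1=0, trt2=0 (that is, trt_new=2), so it would be expected to match the intercept of models 1 and 3.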

proc glimmix data=temp;
   class fu study meta;
   model status = fu study trt1 trt2 / dist=poisson link=log offset=ln_y s;
   random trt1 / subject=study type=vc;
   random trt2 / subject=study type=vc;
   covtest ZeroG;
run;

proc glimmix data=temp;
   class fu study meta trt_new;
   model status = fu study trt_new / dist=poisson link=log offset=ln_y s;
   random trt_new / subject=study type=vc group=meta;
   covtest ZeroG;
run;

The objective is to include a random effect for the treatment effect (A/B) in meta-analysis 1 and a random effect for the treatment effect (A/C) in meta-analysis 2.

I tried the following options to resolve the convergence problem, but they do not work in the second program.

nloptions technique=congra maxiter=1000 gconv=1e-4;

Could you tell me what method is used for the estimation in method=PL, QUAD, RSMPL, ...?

Thanks for your advice.

Gwénaël

PaigeMiller
Diamond | Level 26

It's hard to see what this (except for your first sentence) has to do with your original question.

Could you tell me what method is used for the estimation in method=PL, QUAD, RSMPL, ...?

Actually, no, I cannot, but the SAS Help files have extensive documentation on these methods.

--
Paige Miller
SteveDenham
Jade | Level 19

For the models that you are examining, use METHOD=QUAD.

The code I would try first would be:

proc glimmix data=temp method=quad;
   class fu study meta trt_new;
   model status = fu study trt_new / dist=poisson link=log offset=ln_y s;
   random trt_new / subject=study type=vc group=meta;
   covtest ZeroG;
run;

Steve Denham

G_Le_Teuff
Calcite | Level 5

Thank you very much. I agree that the second question about convergence is not directly related to the first one, but it seems that the coding (class statement or dummy variables) affects the modelling. As suggested by Steve Denham, I used:

1. method=quad without nloptions, but the following messages appear:

NOTE: Convergence criterion (GCONV=1E-8) satisfied.

NOTE: At least one element of the gradient is greater than 1e-3.

NOTE: Estimated G matrix is not positive definite.

and the covariance parameter estimates are:

Covariance Parameter Estimates

Cov Parm    Subject    Group     Estimate    Standard Error
trt_new     study      meta 1           0                 .
trt_new     study      meta 2           0                 .

2. If I add nloptions technique=congra maxiter=1000 gconv=1e-4; the following errors occur:

ERROR: Floating Point Overflow.

ERROR: Termination due to Floating Point Exception

3. In contrast, the following program

proc glimmix data=temp maxopt=400 pconv=1e-4;
   class fu study meta;
   model status = fu study trt1 trt2 / dist=poisson link=log offset=ln_y s;
   random trt1 / subject=study type=vc;
   random trt2 / subject=study type=vc;
   nloptions technique=congra maxiter=1000 gconv=1e-4;
   covtest ZeroG;
run;

gives

Covariance Parameter Estimates

Cov Parm    Subject    Estimate    Standard Error
trt1        study       0.02702           0.03036
trt2        study       0.01845          0.009520

Thank you for your help.

Gwénaël

SteveDenham
Jade | Level 19

Thanks, Gwénaël.  I have added something new to my toolbox when fighting nonconvergence problems!  This could become quite helpful.

Steve Denham

G_Le_Teuff
Calcite | Level 5

Steve,

How can I get access to your toolbox?

Gwénaël

SteveDenham
Jade | Level 19

Brain surgery, I suppose. ;-)

I should have been clearer--my toolbox is figurative, not literal.  A lot of it consists of persistence with Google, reading SAS-L daily for <mumble-mumble> years and just trying different things with existing datasets, where I have a clear idea of how the data were generated.  And new tools come from reading this forum, seeing folks ask questions, trying to answer them, and then seeing how they take answers and use them to interpret what they are doing.

Steve Denham

G_Le_Teuff
Calcite | Level 5

Is it possible to transfer some information from your brain to my brain...

Yes, I understand. I have read several exchanges between you and other people about convergence issues when using proc glimmix.

Thank you again.

Gwénaël
