Dear all,
I fitted a Poisson regression model to a large database with a single categorical variable (0, 1, 2), entered either through the CLASS statement or as 2 dummy variables (a simple model), and I was surprised to see that the intercept estimate differs. Please find below the SAS code for 3 "identical" models that give different results. The reference level is 2.
proc freq data=temp;
tables trt_new*trt1*trt2 / list;
run;
trt_new  trt1  trt2  Frequency  Percent  Cumulative Frequency  Cumulative Percent
0        1     0     44203      55.22    44203                 55.22
1        0     1     10315      12.89    54518                 68.11
2        0     0     25531      31.89    80049                 100.00
Model 1
proc glimmix data=temp;
class trt_new;
model status = trt_new / dist=poisson link=log offset=ln_y s;
run;
Effect     trt_new  Estimate  Standard Error  DF     t Value  Pr > |t|
Intercept           -1.4075   0.01379         80046  -102.10  <.0001
trt_new    0        -0.1615   0.01780         80046  -9.07    <.0001
trt_new    1        -0.1679   0.02712         80046  -6.19    <.0001
trt_new    2        0         .               .      .        .
Model 2
proc glimmix data=temp;
class trt1 trt2;
model status = trt1 trt2 / dist=poisson link=log offset=ln_y s;
run;
Effect     trt1  trt2  Estimate  Standard Error  DF     t Value  Pr > |t|
Intercept              -1.7369   0.02937         80046  -59.14   <.0001
trt1       0           0.1615    0.01780         80046  9.07     <.0001
trt1       1           0         .               .      .        .
trt2             0     0.1679    0.02712         80046  6.19     <.0001
trt2             1     0         .               .      .        .
Model 3
proc glimmix data=temp;
model status = trt1 trt2/ dist=poisson link=log offset=ln_y s;
run;
Effect     Estimate  Standard Error  DF     t Value  Pr > |t|
Intercept  -1.4075   0.01379         80046  -102.10  <.0001
trt1       -0.1615   0.01780         80046  -9.07    <.0001
trt2       -0.1679   0.02712         80046  -6.19    <.0001
Do you have an explanation for why the intercept of model 2 differs from that of models 1 and 3? I would like to extend these models by adding other categorical variables, but I do not know whether it is better to use the CLASS statement or dummy variables.
In addition, how can you define the reference level, given that param=ref and ref=first do not exist in proc glimmix? Thanks in advance.
Gwénaël
If you created the dummy variables properly, then the only difference is that the model is parameterized differently, so the intercept can change. The models are equivalent once the difference in parameterization is taken into account, and your predicted values should also be the same.
Your three examples seem to be providing exactly the same estimates once this difference in parameterization is taken into account.
Model 1
If trt_new=0, then the linear predictor is -1.4075 - 0.1615 = -1.569
Model 2
If trt1=1 and trt2=0, which is the same as trt_new=0, then the linear predictor is -1.7369 + 0 + 0.1679 = -1.569, exactly the same value
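The same check can be extended to all three treatment levels. The data step below is only a sketch: it hard-codes the estimates printed earlier in the thread and has not been run against the original data. The dataset name param_check and the variables eta_m1, eta_m2 and eta_m3 are made up for the illustration.

```sas
/* Linear predictor (before adding the offset) for each treatment level,  */
/* rebuilt from the estimates printed for Models 1, 2 and 3.              */
data param_check;
  do trt_new = 0 to 2;
    trt1 = (trt_new = 0);   /* dummy variable for level 0 */
    trt2 = (trt_new = 1);   /* dummy variable for level 1 */

    /* Model 1: CLASS trt_new, level 2 is the reference */
    eta_m1 = -1.4075 + (trt_new = 0)*(-0.1615) + (trt_new = 1)*(-0.1679);

    /* Model 2: CLASS trt1 trt2, value 1 of each dummy is the reference, */
    /* so each printed coefficient applies when that dummy equals 0      */
    eta_m2 = -1.7369 + (trt1 = 0)*0.1615 + (trt2 = 0)*0.1679;

    /* Model 3: trt1 and trt2 entered as numeric covariates */
    eta_m3 = -1.4075 + trt1*(-0.1615) + trt2*(-0.1679);

    output;
  end;
run;

proc print data=param_check noobs;
run;
```

Up to rounding, the three columns should agree at -1.5690, -1.5754 and -1.4075 for levels 0, 1 and 2. The Model 2 intercept, -1.7369 = -1.4075 - 0.1615 - 0.1679, corresponds to trt1=1 and trt2=1, a combination that does not occur in the data, which is why it looks different from the other two intercepts.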
In addition to the models you present (which have been shown to be equivalent), you could use other options (REF=) in the CLASS statement to set which level is the reference group.
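As a rough sketch of two ways to choose the reference level (not run against the original data): Option 1 assumes a SAS/STAT release whose GLIMMIX CLASS statement accepts REF=, which should be checked in the documentation for the installed version. Option 2 relies only on a user-defined format and ORDER=FORMATTED, so that the wanted reference level sorts last. The format name trtref and its labels are made up for the illustration.

```sas
/* Option 1 (assumption: REF= is accepted by the GLIMMIX CLASS statement */
/* in your release): name the reference level directly.                  */
proc glimmix data=temp;
  class trt_new(ref='0');
  model status = trt_new / dist=poisson link=log offset=ln_y s;
run;

/* Option 2: labels chosen so that level 0 sorts last; with the default  */
/* parameterization the last level in sort order is the reference.       */
proc format;
  value trtref 0 = 'C (level 0, reference)'
               1 = 'B (level 1)'
               2 = 'A (level 2)';
run;

proc glimmix data=temp order=formatted;
  class trt_new;
  format trt_new trtref.;
  model status = trt_new / dist=poisson link=log offset=ln_y s;
run;
```

Either way the fitted model is the same; only the level whose coefficient is set to 0, and hence the meaning of the intercept, changes.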
I prefer to let the PROC generate what I need, using the CLASS statement, as opposed to hand coding dummy variables. Much less chance of a surprise due to improper levels (at least for me).
Steve Denham
For model 2, I used order=formatted to obtain the same results, because REF= did not exist in proc glimmix in SAS 9.3.
I used dummy variables for the treatment variable because the first program below seems to run faster and converges, whereas the second program, which replaces the 2 random statements with a single one, does not converge.
proc glimmix data=temp;
class fu study meta;
model status = fu study trt1 trt2 / dist=poisson link=log offset=ln_y s;
random trt1 / subject=study type=vc;
random trt2 / subject=study type=vc;
covtest ZeroG;
run;
proc glimmix data=temp;
class fu study meta trt_new;
model status = fu study trt_new/ dist=poisson link=log offset=ln_y s;
random trt_new / subject=study type=vc group=meta;
covtest ZeroG;
run;
The objective is to include a random effect for the treatment effect (A/B) in meta-analysis 1 and a random effect for the treatment effect (A/C) in meta-analysis 2.
I tried the following options to resolve the convergence problem, but they do not work in the second program.
nloptions technique=congra maxiter=1000 gconv=1e-4;
Could you tell me which estimation method is used with method=PL, QUAD, RSMPL, ...?
Thanks for your advice.
Gwénaël
It's hard to see what this (except for your first sentence) has to do with your original question.
"Could you tell me which estimation method is used with method=PL, QUAD, RSMPL, ...?"
Actually, no, I cannot, but the SAS Help files have extensive documentation on these methods.
For the models that you are examining, use METHOD=QUAD.
The code I would try first would be:
proc glimmix data=temp method=quad;
class fu study meta trt_new;
model status = fu study trt_new/ dist=poisson link=log offset=ln_y s;
random trt_new / subject=study type=vc group=meta;
covtest ZeroG;
run;
Steve Denham
Thank you very much. I agree that the second question about convergence is not directly related to the first question, but it seems that the coding (CLASS statement or dummy variables) affects the modelling. As suggested by Steve Denham, I used:
1. method=quad without nloptions, but the following messages occur:
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: At least one element of the gradient is greater than 1e-3.
NOTE: Estimated G matrix is not positive definite.
and the covariance parameter estimates are:
Covariance Parameter Estimates
Cov Parm  Subject  Group   Estimate  Standard Error
trt_new   study    meta 1  0         .
trt_new   study    meta 2  0         .
2. If I add nloptions technique=congra maxiter=1000 gconv=1e-4;
the following messages occur:
ERROR: Floating Point Overflow.
ERROR: Termination due to Floating Point Exception
3. However, the following program
proc glimmix data=temp maxopt=400 pconv=1e-4;
class fu study meta;
model status = fu study trt1 trt2 / dist=poisson link=log offset=ln_y s ;
random trt1 / subject=study type=vc;
random trt2 / subject=study type=vc;
nloptions technique=congra maxiter=1000 gconv=1e-4;
covtest ZeroG;
run;
gives
Covariance Parameter Estimates
Cov Parm  Subject  Estimate  Standard Error
trt1      study    0.02702   0.03036
trt2      study    0.01845   0.009520
Thank you for your help.
Gwénaël
Thanks, Gwénaël. I have added something new to my toolbox when fighting nonconvergence problems! This could become quite helpful.
Steve Denham
Steve,
How can I access your toolbox?
Gwénaël
Brain surgery, I suppose.
I should have been clearer--my toolbox is figurative, not literal. A lot of it consists of persistence with Google, reading SAS-L daily for <mumble-mumble> years and just trying different things with existing datasets, where I have a clear idea of how the data were generated. And new tools come from reading this forum, seeing folks ask questions, trying to answer them, and then seeing how they take answers and use them to interpret what they are doing.
Steve Denham
Is it possible to transfer some information from your brain to my brain...
Yes, I understand. I have read several exchanges between you and other people about convergence issues when using proc glimmix.
Thank you again
Gwénaël