Re: poisson regression using sas

alphabeta · Posted 08-07-2015 09:38 PM

Good evening,

I have two questions when translating stata codes to sas codes.

First, for poisson regression with longitudinal panel data, I tried proc glimmix in sas but get different coefficient estimates from xtpoisson command from stata.

Can anyone give me some suggestions?

Stata code:

xtset patient visit

#delimit ;
xtpoisson y x1 x2 x3 x4 x5 x6 x7 x8 x9 x10, exposure(visit) normal;
#delimit cr

SAS code:

PROC GLIMMIX METHOD=QUAD(QPOINTS=50) DATA=data_a;
CLASS patient;
MODEL y = x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 / DIST=POISSON OFFSET=logvisit LINK=LOG S;
RANDOM INTERCEPT / SUBJECT=patient TYPE=UN;
RUN;

Second, I also fit a cross-sectional simple poisson regression to get predicted counts.

However, stata and sas seems to produce the predcited value differently.

Based on the code below, can anyone give me some hints how to get the same predicted "lnlm" in sas?

Stata code:

#delimit ;
xi: poisson y x1 x2 x3 x4 x5 x6 x7 x8 x9 x10, exposure(visit);
#delimit cr

predict lnlm, xb nooffset

SAS code:
PROC GENMOD DATA=data_b;
CLASS patient;
MODEL y = x1 x2 x3 x4 x5 x6 x7 x8 x9 x10/ DIST=POISSON OFFSET=logvisit LINK=LOG TYPE3 WALD;
output out=out_b p=pred;
RUN;

Thanks!!

JacobSimonsen · Posted 08-08-2015 11:21 AM

It is likely that the difference is due to the covariance structure. You used a unstructured covariance matrix, but I would guess that STATA use the varicance-component structure as default, as SAS also do.

Can you write your model in mathematical terms, not all here understand STATA code.

alphabeta · Posted 08-10-2015 07:22 PM

Hi Jacob,

I tried different covariance structure options, but the coefficients are still different.

I was only given the stata codes, but I assume the original coder wants to fit a random effect poisson regression with an exposure variable, and the random effect follows a normal distribution based on the code.

Thanks!

alphabeta

ets_kps · Posted 08-10-2015 01:58 PM

Hi alphabeta,

You might want to try this. SAS/ETS(R) 14.1 User's Guide

The COUNTREG procedure in ETS has panel data poisson regression which is likely similar to the Stata xtpoisson command.

try

proc countreg data=a groupid=id;

model y = x1-x10 / errorcomp=fixed dist=poisson; run;

Let me know if this helps. -Ken

alphabeta · Posted 08-10-2015 07:24 PM

Hi Ken,

Thanks for your method.

I tried this method with both fixed and random effect, but the coefficients are still not close either.

Thanks!

alphabeta

alphabeta · Posted 08-10-2015 07:36 PM

Hi all,

For the predict part, I figured out lnlm=log(pred/visit), pred=exp(lnlm)*visit

I didn't divide back the exposure variable, that is why stata and sas were different.

For the coefficients, somehow fitting simple poisson regression was not a problem, but fitting a random effect poisson regression using stata and sas gives me very different coefficient estimates.

Still not having a clue yet.

alphabeta

ets_kps · Posted 08-11-2015 04:46 PM

I think the coefficients are different because of your use of the exposure() options. If you run the following example you will find that the SAS and Stata coefficients are exactly the same.

data ships;

input ship time yr_con yr_op service accident op co_65_69 co_70_74 co_75_79

weight;

int=1;

cards;

1 1 1 1 127 0 0 0 0 0 3

1 2 1 2 63 0 1 0 0 0 4

1 3 2 1 1095 3 0 1 0 0 4

1 4 2 2 1095 4 1 1 0 0 3

1 5 3 1 1512 6 0 0 1 0 6

1 6 3 2 3353 18 1 0 1 0 7

1 7 4 1 . . 0 0 0 1 8

1 8 4 2 2244 11 1 0 0 1 7

2 1 1 1 44882 39 0 0 0 0 4

2 2 1 2 17176 29 1 0 0 0 5

2 3 2 1 28609 58 0 1 0 0 3

2 4 2 2 20370 53 1 1 0 0 2

2 5 3 1 7064 12 0 0 1 0 3

2 6 3 2 13099 44 1 0 1 0 4

2 7 4 1 . . 0 0 0 1 6

2 8 4 2 7117 18 1 0 0 1 5

3 1 1 1 1179 1 0 0 0 0 4

3 2 1 2 552 1 1 0 0 0 3

3 3 2 1 781 0 0 1 0 0 2

3 4 2 2 676 1 1 1 0 0 1

3 5 3 1 783 6 0 0 1 0 3

3 6 3 2 1948 2 1 0 1 0 4

3 7 4 1 . . 0 0 0 1 5

3 8 4 2 274 1 1 0 0 1 3

4 1 1 1 251 0 0 0 0 0 2

4 2 1 2 105 0 1 0 0 0 1

4 3 2 1 288 0 0 1 0 0 5

4 4 2 2 192 0 1 1 0 0 5

4 5 3 1 349 2 0 0 1 0 4

4 6 3 2 1208 11 1 0 1 0 3

4 7 4 1 . . 0 0 0 1 2

4 8 4 2 2051 4 1 0 0 1 4

5 1 1 1 45 0 0 0 0 0 5

5 2 1 2 . . 1 0 0 0 6

5 3 2 1 789 7 0 1 0 0 7

5 4 2 2 437 7 1 1 0 0 8

5 5 3 1 1157 5 0 0 1 0 9

5 6 3 2 2161 12 1 0 1 0 6

5 7 4 1 . . 0 0 0 1 4

5 8 4 2 542 1 1 0 0 1 3

;

data two;

set ships;

lnservice = log(service);

run;

proc countreg data=two dist=poisson groupid=ship;

model accident=op co_65_69 co_70_74 co_75_79/ errorcomp=fixed offset=lnservice;

run;

proc countreg data=two dist=negbin groupid=ship;

model accident=op co_65_69 co_70_74 co_75_79/ errorcomp=fixed offset=lnservice;

run;

*poisson with RE and no exposure(service) option;

proc countreg data=two dist=poisson groupid=ship;

model accident=op co_65_69 co_70_74 co_75_79/ errorcomp=random /*offset=lnservice*/;

run;

*Poisson with RE and exposure(service);

proc countreg data=two dist=poisson groupid=ship ;

model accident=op co_65_69 co_70_74 co_75_79/ errorcomp=random offset=lnservice;

run;

proc export data=two outfile="C:\Public\two.dta"; run;

/*open stata and run this

use "C:\Public\two.dta", clear

xtset ship

*coefficients match the "Poisson with RE and no exposure(service) option" exactly

xtpoisson accident op co_65_69 co_70_74 co_75_79, re

estimates store re

*coefficients matches the "Poisson with RE and exposure(service)" exactly

xtpoisson accident op co_65_69 co_70_74 co_75_79, exposure(service) re

estimates store re_wos

outreg2 [re re_wos] using compare, addstat(log-like, `e(ll)')

alphabeta · Posted 08-11-2015 11:39 PM

Hi Ken,

Somehow running sas 9.3 gives me error message as follows:

271 proc countreg data=two dist=poisson groupid=ship ;

-------

22

76

ERROR 22-322: Syntax error, expecting one of the following: ;, ABSFCONV, ABSGCONV, CORRB, COVB,

COVEST, COVOUT, DATA, DIST, FCONV, GCONV, ITPRINT, MAXITER, METHOD, NOPRINT,

OUTEST, PRINTALL, SEED, TYPE.

ERROR 76-322: Syntax error, statement will be ignored.

If change the codes as

proc countreg data=two dist=poisson;
by ship ;

then sas gives me another error message

284 proc countreg data=two dist=poisson;

285 by ship ;

286 model accident=op co_65_69 co_70_74 co_75_79/ errorcomp=random offset=lnservice;

---------

22

76

ERROR 22-322: Syntax error, expecting one of the following: ;, CORRB, COVB, COVEST, COVOUT,

DIST, ITPRINT, METHOD, NOINT, NOPRINT, OFFSET, OUTEST, PRINTALL, TYPE.

ERROR 76-322: Syntax error, statement will be ignored.

Can you take a look for me? Thanks!!

ets_kps · Posted 08-12-2015 10:54 AM

alphabeta,

Our apologies. I was using syntax that we introduced in a more recent version. Please try the following modifications.

remove the groupid=

include a id statement

change countreg to TCOUNTREG

For example

*poisson with RE and no exposure(service) option;

proc tcountreg data=two dist=poisson ;

id ship;

model accident=op co_65_69 co_70_74 co_75_79/ errorcomp=random /*offset=lnservice*/;

run;

*Poisson with RE and exposure(service);

proc tcountreg data=two dist=poisson;

id=ship;

model accident=op co_65_69 co_70_74 co_75_79/ errorcomp=random offset=lnservice;

run;

I have run this vs old syntax and get the same estimates from older to newer versions. ==> it will still match Stata's coefficients.

Good luck-Ken

alphabeta · Posted 08-12-2015 04:14 PM

Hi Ken,

I ran both sas 9.3 and sas 9.4 today, but the coefficients are still different from the stata command given.

The codes and results are as follows

Stata codes:

xtset patient visit

#delimit ;
xtpoisson y x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12 x13 x14 x15 x16, exposure(visit) normal;
#delimit cr

Stata results

Fitting comparison Poisson model:

Iteration 0: log likelihood = -13626.58

Iteration 1: log likelihood = -13626.554

Iteration 2: log likelihood = -13626.554

Fitting full model:

tau = 0.0 log likelihood = -13626.554

tau = 0.1 log likelihood = -13563.763

tau = 0.2 log likelihood = -13549.87

tau = 0.3 log likelihood = -13562.89

Iteration 0: log likelihood = -13549.935

Iteration 1: log likelihood = -13547.007

Iteration 2: log likelihood = -13546.996

Iteration 3: log likelihood = -13546.996

Random-effects Poisson regression Number of obs

> =

> 100000

Group variable: patient Number of groups

> =

> 2391

Random effects u_i ~ Gaussian Obs per group: min

> =

> 1

avg

> =

> 41.8

max

> =

> 231

Integration method: mvaghermite Integration points

> =

> 12

Wald chi2(16)

> =

> 308.08

Log likelihood = -13546.996 Prob > chi2

> =

> 0.0000

-------------------------------------------------------------------

> -----------

y | Coef. Std. Err. z P>|z| [95% Conf

> . Interval]

-------------+-----------------------------------------------------

> -----------

x1 | -.0023401 .0019663 -1.19 0.234 -.0061939

> .0015137

x2 | .0791925 .0509703 1.55 0.120 -.0207075

> .1790925

x3 | -.0556415 .0857818 -0.65 0.517 -.2237707

> .1124877

x4 | .2604183 .1530857 1.70 0.089 -.0396241

> .5604608

x5| .0260236 .1988315 0.13 0.896 -.363679

> .4157261

x6 | .0096076 .0558143 0.17 0.863 -.0997864

> .1190016

x7 | .8417932 .0595242 14.14 0.000 .7251279

> .9584586

x8 | -.0860519 .0591174 -1.46 0.146 -.2019199

> .029816

x9 | -.3834076 .0742257 -5.17 0.000 -.5288874

> -.2379278

x10| .0169214 .0610793 0.28 0.782 -.1027917

> .1366346

x11 | -.2420708 .0549503 -4.41 0.000 -.3497714

> -.1343701

x12 | -.3106347 .0593382 -5.23 0.000 -.4269354

> -.194334

x13 | -.1306361 .0632131 -2.07 0.039 -.2545315

> -.0067407

x14| -.1735819 .0566482 -3.06 0.002 -.2846104

> -.0625535

x15 | -.2393673 .0561851 -4.26 0.000 -.3494881

> -.1292465

x16 | -.0680811 .0809386 -0.84 0.400 -.2267179

> .0905557

_cons | -7.036347 .1595935 -44.09 0.000 -7.349144

> -6.723549

ln(visit) | 1 (exposure)

-------------+-----------------------------------------------------

> -----------

/lnsig2u | -1.364564 .1253317 -10.89 0.000 -1.610209

> -1.118918

-------------+-----------------------------------------------------

> -----------

sigma_u | .5054623 .0316752 .4470412

> .5715182

-------------------------------------------------------------------

> -----------

Likelihood-ratio test of sigma_u=0: chibar2(01) = 159.11 Pr>=chib

> ar2 = 0.000

SAS 9.4 codes

proc countreg data=out_sample1 dist=poisson groupid=patient ;
model y = x1 x2 x3 x4 x5x 6 x7 x8 x9 x10 x11 x12 x13 x14 x15 x16 / errorcomp=random offset=logvisit;
run;

SAS 9.4 results

The COUNTREG Procedure

Model Fit Summary

Dependent Variable y

Number of Observations 100000

Data Set WORK.OUT_SAMPLE1

Model Poisson

Error Component Random

Number of Cross Sections 2391

Offset Variable logvisit

Log Likelihood -13554

Maximum Absolute Gradient 9.56066E-7

Number of Iterations 15

Optimization Method Newton-Raphson

AIC 27143

SBC 27315

Algorithm converged.

Parameter Estimates

Parameter DF Estimate Standard Error t Value Pr > |t|

Intercept 1 -6.899390 0.156789 -44.00 <.0001

x1 1 -0.002426 0.001927 -1.26 0.2079

x2 1 0.091417 0.050119 1.82 0.0682

x3 1 -0.067597 0.084548 -0.80 0.4240

x4 1 0.267802 0.154460 1.73 0.0830

x5 1 0.039801 0.197891 0.20 0.8406

x6 1 0.003591 0.054764 0.07 0.9477

x7 1 0.848595 0.058601 14.48 <.0001

x8 1 -0.085057 0.058143 -1.46 0.1435

x9 1 -0.384157 0.072340 -5.31 <.0001

x10 1 0.022466 0.060081 0.37 0.7085

x11 1 -0.253048 0.054064 -4.68 <.0001

x12 1 -0.320980 0.058543 -5.48 <.0001

x13 1 -0.146113 0.062766 -2.33 0.0199

x14 1 -0.172344 0.055604 -3.10 0.0019

x15 -0.241250 0.055214 -4.37 <.0001

x16 1 -0.074001 0.079395 -0.93 0.3513

_Alpha 1 0.229979 0.028888 7.96 <.0001

SAS 9.3 codes

proc tcountreg data=out_sample1 dist=poisson;
id patient;
model y= x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12 x13 x14 x15 x16 / errorcomp=random offset=logvisit;
run;

SAS 9.3 results

The TCOUNTREG Procedure

Model Fit Summary

Dependent Variable y

Number of Observations 100000

Data Set WORK.OUT_SAMPLE1

Model Poisson

Error Component Random

Offset Variable logvisit

Log Likelihood -13557

Maximum Absolute Gradient 1.12475E-6

Number of Iterations 14

Optimization Method Newton-Raphson

AIC 27151

SBC 27322

Algorithm converged.

Parameter Estimates

Parameter DF Estimate Standard Error t Value Pr > |t|

Intercept 1 -6.899832 0.153968 -44.81 <.0001

x1 1 -0.002457 0.001891 -1.30 0.1938

x2 1 0.086361 0.049271 1.75 0.0796

x3 1 -0.070813 0.083061 -0.85 0.3939

x4 1 0.263515 0.152089 1.73 0.0832

x5 1 0.030784 0.195286 0.16 0.8747

x6 1 -0.000162 0.053809 -0.00 0.9976

x7 1 0.837899 0.057722 14.52 <.0001

x8 1 -0.084471 0.057015 -1.48 0.1385

x9 1 -0.373889 0.071050 -5.26 <.0001

x10 1 0.019545 0.058936 0.33 0.7402

x11 1 -0.246067 0.053034 -4.64 <.0001

x12 1 -0.315442 0.057560 -5.48 <.0001

x13 1 -0.141599 0.061752 -2.29 0.0218

x14 1 -0.175507 0.054645 -3.21 0.0013

x15 1 -0.241094 0.054212 -4.45 <.0001

x16 1 -0.080489 0.077757 -1.04 0.3006

_Alpha 1 0.222232 0.028434 7.82 <.0001

Is this difference due to "unbalanced"panel data?

I noticed stata results indicate panel variable: patient (unbalanced).

Thanks!!

alphabeta

ets_kps · Posted 08-13-2015 11:52 AM

It is possible that it is due to an imbalance. You could check this by deleting a couple time obs from a cross section and then checking against my code i provided.

I am afraid that if you need any more assistance, you will have to contact Tech Support and open a formal track.

1-800-727-0025

Good luck-Ken

SAS Innovate 2025: Save the Date