turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- Analytics
- /
- Stat Procs
- /
- PROC GENMOD and BAYES statement with EXP option: d...

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

3 weeks ago

In PROC GENMOD (v9.4 TS 1M3, x64 for Windows), why is there a discrepancy between the exponentiated estimates as output using the EXP option in the LSMEANS statement and the calculated values (i.e. doing the exponentiation calculation in a data step) when the BAYES statement is used, but not when a frequentist analysis is done? The values differ very slightly in the Bayesian analysis. Am I missing something?

Here is some code, run on an example in the book by Stokes, Davis and Koch (p. 354).

data respire;

do region=1,2;

do agegp=2,3,4,5,6,1;

input cases total;

ltotal=log(total);

output;

end;

end;

datalines;

75 220407

68 198119

63 134084

45 70708

27 34233

64 1074246

76 564535

98 592983

104 450740

63 270908

80 161850

61 2880262

run;

proc format;

value agfmt 1='<35'

2='35-44'

3='45-54'

4='55-64'

5='65-74'

6='>75'

;

value regfmt 1='South'

2='North'

;

run;

proc print data=respire;

format agegp agfmt. region regfmt.;

run;

/* Frequentist*/

ods output diffs=frediffs_all ;

proc genmod data=respire;

class agegp region;

model cases=agegp region/dist=poisson link=log offset=ltotal;

lsmeans agegp /diff=control("1") exp;

lsmeans region /diff=control("1") exp;

title 'Koch p 354 Frequentist Poisson analysis';

run;

ods output close;

/* Bayesian*/

ods output diffs=baydiffs_all ;

proc genmod data=respire;

class agegp region;

model cases=agegp region/dist=poisson link=log offset=ltotal;

bayes seed=555 STATS(alpha=0.05 percent=2.5 25 50 75 97.5 )=all nbi=2000 nmc=300000 diagnostics=(autocorr ess gelman(nchain=3)) ;

lsmeans agegp /diff=control("1") exp;

lsmeans region /diff=control("1") exp;

title 'Koch p 354 Bayesian Poisson analysis';

run;

quit;

ods output close;

proc sort data=frediffs_all;

by agegp region;

run;

proc sort data=baydiffs_all;

by agegp region;

run;

/* Compare exponentiated (as calculated) vs. printed*/

data allestdiff;

set frediffs_all(in=f)

baydiffs_all(in=b);

by agegp region;

if f then antype='Frequentist';

else if b then antype='Bayesian';

expest=exp(estimate);

label expest='Expomnentiated estimate (calculated)';

run;

options ls=132;

proc compare data=allestdiff ;

id antype agegp region;

var expestimate ;

with expest;

title 'Comparison of exponentiated estimates as output, and calculated';

run;

proc print data=allestdiff label;

format region regfmt. agegp agfmt. estimate expest expestimate 8.6;

id antype;

var agegp region estimate expest expestimate;

title 'Estimates from GENMOD using frequentist and Bayesian analyses : exponentiated as output and as calculated';

run;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

3 weeks ago - last edited 3 weeks ago

Closed form solution (frequentist) vs. monte carlo estimate from random sampling (Bayesian). Change the parameters of your sampling (seed, number of samples, number of chains, etc.) and your monte carlo estimate will change.

Even if the frequentist solution is not closed form (eg requires Newton-Raphson), the algorithm for solving is deterministic and thus will provide the same solution each time.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

3 weeks ago

No, this is not the point. I am not querying the difference between the results for the Bayeiand and frequentist analyses: I accept that they will not match exactly. I am questioning why the results you get when using the EXP option in LSMEANS differs from a calculated value when conducting a Byesian analysis, but not when using the frequentist analysis. For the frequentist analysis (i.e. normal Genmod without the BAYES statement), the estimates that are output with the EXP option are exactly the same as if you calculate the exponentiated values in a data step. If you use the BAYES statement this is not the case. Why?

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

3 weeks ago

Ahh, I see. Apologies for the lazy reply.

Well, with these issues I usually first consider loss due to floating point representation. In this case, is the Bayesian estimate of the odds ratio computed at each iteration? If so, then the procedure does not simply exponentiate the estimate at the end of the sampling, but rather uses a running average. As the average is computed, loss due to floating point representation may cause the observed difference. For the frequentist approach, the odds ratio is exactly exp(estimate).

Just a guess though.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

3 weeks ago

Apology accepted! That's an interesting theory of yours and a possible explanation. I guess we shall have to wait for the guys at SAS to tell us whether that is correct. If it is, then it answers the question but still makes it rather difficult when it comes to presenting results: which should one use?!

If it's not the answer, then what is going on? I agree that floating point issues are often the cause of such things, which is why I did the exponentiation in a data step. I had been using my calculator but then wondered if the use of 4 DP was affecting things, and was interested to see that the discrepancy remained when I used the data step.

.