Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Home
- /
- Analytics
- /
- Stat Procs
- /
- PROC GENMOD and BAYES statement with EXP option: discrepancy between L...

Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Posted 07-26-2017 10:52 AM
(1773 views)

In PROC GENMOD (v9.4 TS 1M3, x64 for Windows), why is there a discrepancy between the exponentiated estimates as output using the EXP option in the LSMEANS statement and the calculated values (i.e. doing the exponentiation calculation in a data step) when the BAYES statement is used, but not when a frequentist analysis is done? The values differ very slightly in the Bayesian analysis. Am I missing something?

Here is some code, run on an example in the book by Stokes, Davis and Koch (p. 354).

data respire;

do region=1,2;

do agegp=2,3,4,5,6,1;

input cases total;

ltotal=log(total);

output;

end;

end;

datalines;

75 220407

68 198119

63 134084

45 70708

27 34233

64 1074246

76 564535

98 592983

104 450740

63 270908

80 161850

61 2880262

run;

proc format;

value agfmt 1='<35'

2='35-44'

3='45-54'

4='55-64'

5='65-74'

6='>75'

;

value regfmt 1='South'

2='North'

;

run;

proc print data=respire;

format agegp agfmt. region regfmt.;

run;

/* Frequentist*/

ods output diffs=frediffs_all ;

proc genmod data=respire;

class agegp region;

model cases=agegp region/dist=poisson link=log offset=ltotal;

lsmeans agegp /diff=control("1") exp;

lsmeans region /diff=control("1") exp;

title 'Koch p 354 Frequentist Poisson analysis';

run;

ods output close;

/* Bayesian*/

ods output diffs=baydiffs_all ;

proc genmod data=respire;

class agegp region;

model cases=agegp region/dist=poisson link=log offset=ltotal;

bayes seed=555 STATS(alpha=0.05 percent=2.5 25 50 75 97.5 )=all nbi=2000 nmc=300000 diagnostics=(autocorr ess gelman(nchain=3)) ;

lsmeans agegp /diff=control("1") exp;

lsmeans region /diff=control("1") exp;

title 'Koch p 354 Bayesian Poisson analysis';

run;

quit;

ods output close;

proc sort data=frediffs_all;

by agegp region;

run;

proc sort data=baydiffs_all;

by agegp region;

run;

/* Compare exponentiated (as calculated) vs. printed*/

data allestdiff;

set frediffs_all(in=f)

baydiffs_all(in=b);

by agegp region;

if f then antype='Frequentist';

else if b then antype='Bayesian';

expest=exp(estimate);

label expest='Expomnentiated estimate (calculated)';

run;

options ls=132;

proc compare data=allestdiff ;

id antype agegp region;

var expestimate ;

with expest;

title 'Comparison of exponentiated estimates as output, and calculated';

run;

proc print data=allestdiff label;

format region regfmt. agegp agfmt. estimate expest expestimate 8.6;

id antype;

var agegp region estimate expest expestimate;

title 'Estimates from GENMOD using frequentist and Bayesian analyses : exponentiated as output and as calculated';

run;

6 REPLIES 6

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Closed form solution (frequentist) vs. monte carlo estimate from random sampling (Bayesian). Change the parameters of your sampling (seed, number of samples, number of chains, etc.) and your monte carlo estimate will change.

Even if the frequentist solution is not closed form (eg requires Newton-Raphson), the algorithm for solving is deterministic and thus will provide the same solution each time.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Ahh, I see. Apologies for the lazy reply.

Well, with these issues I usually first consider loss due to floating point representation. In this case, is the Bayesian estimate of the odds ratio computed at each iteration? If so, then the procedure does not simply exponentiate the estimate at the end of the sampling, but rather uses a running average. As the average is computed, loss due to floating point representation may cause the observed difference. For the frequentist approach, the odds ratio is exactly exp(estimate).

Just a guess though.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Apology accepted! That's an interesting theory of yours and a possible explanation. I guess we shall have to wait for the guys at SAS to tell us whether that is correct. If it is, then it answers the question but still makes it rather difficult when it comes to presenting results: which should one use?!

If it's not the answer, then what is going on? I agree that floating point issues are often the cause of such things, which is why I did the exponentiation in a data step. I had been using my calculator but then wondered if the use of 4 DP was affecting things, and was interested to see that the discrepancy remained when I used the data step.

.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Hello my dears,

Can someone provide me with the syntax of lsmeans option showing mean separations in letter using Process genmod

Can someone provide me with the syntax of lsmeans option showing mean separations in letter using Process genmod

**Available on demand!**

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.